(57) Abstract: A method of identifying putative naturally occurring antisense transcripts is provided. The method is effected by
(a) computationally aligning a first database including sense-oriented polynucleotide sequences with a second database including
expressed polynucleotide sequences; and (b) identifying expressed polynucleotide sequences from the second database being capable
of forming a duplex with at least one sense-oriented polynucleotide sequence of the first database, thereby identifying putative
naturally occurring antisense transcripts.

METHODS AND SYSTEMS FOR IDENTIFYING NATURALLY
OCCURRING ANTISENSE TRANSCRIPTS AND METHODS, KITS AND
ARRAYS UTILIZING SAME

BACKGROUND AND FIELD OF THE INVENTION

The present invention relates to the field of naturally occurring antisense
transcripts. More particularly, the present invention relates to methods of
identifying naturally occurring antisense transcripts, databases storing
polynucleotide sequences encoding identified naturally occurring antisense
transcripts, oligonucleotides derived therefrom and methods and kits utilizing
same.

Naturally occurring antisense RNA transcripts are endogenous
transcripts which exhibit complementarity to sense transcripts, which are
typically of a known function. It has been established that these endogenous
antisense transcripts play an important role in regulating prokaryotic gene
expression and are increasingly implicated as involved in eukaryotic gene
regulation.

Cis-encoded antisense transcripts are encoded by the same locus as the
sense transcripts and are transcribed from the strand of DNA opposite to that
encoding the sense transcript; as such, cis-encoded antisense transcripts are
typically completely complementary with a portion of the sense transcript.

Trans-encoded antisense transcripts are, by contrast, transcripts which
are encoded at a different locus and as such may display only partial
complementarity with a sense transcript.

Natural antisense RNAs were first described in prokaryote studies,
which suggested that such transcripts play a role in gene expression regulation.
Prokaryotic antisense transcripts are widely distributed and are involved in the
control of numerous biological functions including transposition, plasmid
replication, incompatibility and conjugation. In prokaryotes, antisense
transcripts are typically involved in down-regulation of sense transcript
expression, although involvement in positive regulation was also suggested
[reviewed in Wagner EG. and Simons RW. (1994) Annu. Rev. Microbiol.
48:713-742].

The first example of transcription from both strands of eukaryotic DNA
was illustrated in human and mouse mitochondrial genes [Anderson S. et al.
(1981) Nature 290:457-465 and Bibb MJ. et al. (1981) Cell 26:167-180]. Since
then, examples of antisense transcripts have been documented in a variety of
organisms including viruses, slime molds, insects, amphibians and birds as well
as mammals. It is thought that these antisense RNAs are involved in extremely
diverse biological functions, such as hormonal response, control of
proliferation, development, structure, viral replication and others. Some
antisense RNAs are conserved between species, suggesting that these antisense
RNAs are not fortuitous but rather play an important role in gene expression
regulation [Kindy MS. et al. (1987) Mol. Cell Biol. 7:2857-2862, Nepveu A.
and Marcu KB. (1986) EMBO J. 5:2859-2865 and Bentley DL. et al. (1986)
Nature 321:702-706].

Antisense transcripts can also encode proteins. Examples of protein
encoding antisense transcripts include rev-ErbAα [Lazar MA. (1989) Mol. Cell.
Biol. 9:1128-1136], gfg [Kimelman D. et al. (1989) Cell 59:687-696] and n-cym
[Armstrong BC. et al. (1992) Cell Growth Differ. 3:385-390]. Such antisense
transcripts typically include a distinct open reading frame (ORF) and
polyadenylation signal for transport to the cytoplasm.

However, it is believed that most antisense transcripts play a role in gene
expression regulation. This assumption is mostly based on spatial and/or
temporal distributions of sense and antisense transcripts. Indeed, tissue
distribution studies suggest that high levels of sense and antisense transcripts
rarely occur together, as was exemplified for the dopa decarboxylase transcripts
in Drosophila [Spencer CA. et al. (1986) Nature 322:279-281]. Additional
studies demonstrated that changes in sense gene expression correlate with
presence of antisense RNA. Furthermore, an inverse relationship between
levels of accumulation of sense and antisense transcripts has also been
reported, for example for α1(I) collagen transcripts in chondrocytes under
chemotherapy [Farrell CM. and Lukens LN. (1995) J. Biol. Chem.
270:3400-3408]. However, it will be appreciated that mutual expression of
sense and their corresponding antisense transcripts is also reported and may
involve a different mechanism of regulation.

Evidence for involvement of antisense-mediated gene regulation in the
development of pathologies has also been presented. For example, endogenous
antisense transcripts may be involved in regulation of the expression levels of
the tumor suppressor gene WT1 observed in Wilms' tumors [Eccles MR. et al.
(1994) Oncogene 9:2059-2063].

Natural antisense regulation of gene expression can be effected via one 
of several mechanisms. 

Nuclear regulation 

Nuclear regulation can be effected via several gene-processing pathways
[reviewed in Vanhee-Brossollet C. and Vaquero C. (1998) Gene 211:1-9].

dsRNA-mediated DNA methylation - complementation between
endogenous sense transcripts and antisense transcripts of sequences as short as
30 bp may initiate DNA methylation, a well-established phenomenon in a
number of organisms [Sharp A. (2001) Genes Dev. 15:485-490]. Methylation
can be directed to different portions of an encoding region of the gene or to the
promoter region. DNA methylation results in complete suppression of
transcription, probably by recruitment of histone deacetylases.

Transcriptional regulation - in which case antisense transcription
hampers sense transcription. Such interference may involve the collision of
two transcription complexes. Alternatively, interference may result from
competition for an essential rate-limiting transcription factor, resulting in
premature termination or in reduced elongation of transcription, the transcripts
with the highest rate of transcription being predominant.

Post-transcriptional nuclear regulation - involves antisense
interference with maturation and/or transport of the sense transcript to the
cytoplasm. Alternatively, antisense transcripts displaying similar structural
features to sense transcripts can bind proteins expected to interact with their
sense counterparts, thereby depriving sense messengers of proteins necessary
for their function.

Cytoplasmic regulation 

Messenger stability - double stranded RNA may affect messenger
stability via "RNA interference", which involves short segments of double
stranded RNA (dsRNA) homologous in sequence to the silenced gene. These
undersized segments, which are generated by a ribonuclease III cleavage of
longer dsRNAs, can guide a single stranded target mRNA, via base pairing, to a
multisubunit complex which participates in the degradation of the target
mRNA. Alternatively, messenger stability may be affected by RNA
degradation, which is mediated by double stranded RNA-directed RNases.

Translation - masking the 3' untranslated region (UTR) and the polyA
tail of the sense transcript is believed to modulate translation efficiency,
probably via direct or indirect interaction between 3'-proximal elements and
upstream sequences or structures [reviewed in Jackson RJ. and Standart N.
(1990) Cell 62:15-24].

The realization of the fundamental role antisense transcripts play in regulating
sense transcription, stability and function has resulted in a number of attempts to
systematically identify natural antisense transcripts. Accordingly, different
approaches were taken for exploring non-coding antisense RNA transcripts and
antisense transcripts including an ORF. Although the latter carries ORF
consensus parameters, uncovering antisense data from general sequence
databases has proven to be a complicated task, as many of these sequences
include an evolutionarily conserved secondary structure rather than a conserved
primary sequence; therefore, primary sequence alignment methods are often not
very effective. Indeed, only a few attempts have been made to date, with only
limited success.

Maizel's group [Chen JH. et al. (1990) Comput. Applic. Biosci. 6:7-18
and Le SY. et al. (1990) Human Genome Initiative and DNA Recombination
Vol. 1:127-136] has experimented with methods that look for regions of a
genome with predicted RNA structures that are significantly more stable
thermodynamically than random sequences of the same base composition.
Although this approach detected a few highly structured non-coding RNAs, as
well as a few cis-regulatory structures, it appears that it is of limited use for
large-scale applications.

Another approach examined coding dense genomes, having suspicious-looking
large regions with little or no coding potential termed "gray holes"
[Olivas WM. et al. (1997) Nucleic Acids Res. 25:4619-4625]. Fifty-nine gray
holes were tested in the yeast genome. Northern analysis detected distinct
transcripts from 15 of the gray holes. Only one transcript appeared to be a
non-coding antisense transcript, illustrating the low efficiency of this method.

There is thus a widely recognized need for, and it would be highly
advantageous to have, methods of systematically identifying novel naturally
occurring antisense molecules and methods of artificially generating and using
same for detecting, quantifying and/or regulating sense transcripts, such as, for
example, mRNA transcripts associated with a pathological state.

SUMMARY OF THE INVENTION 

According to one aspect of the present invention there is provided a
method of identifying putative naturally occurring antisense transcripts, the
method comprising: (a) computationally aligning a first database including
sense-oriented polynucleotide sequences with a second database including
expressed polynucleotide sequences; and (b) identifying expressed
polynucleotide sequences from the second database being capable of forming a
duplex with at least one sense-oriented polynucleotide sequence of the first
database, thereby identifying putative naturally occurring antisense transcripts.
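
By way of illustration only, steps (a) and (b) of this aspect may be sketched in Python as follows. The sketch is not the alignment procedure of the invention: it substitutes a simple longest-exact-match comparison for a real alignment tool, and the names find_putative_antisense and min_overlap, as well as the default threshold of 20 nucleotides, are assumptions introduced here. An expressed sequence is flagged when its reverse complement shares a sufficiently long stretch with a sense-oriented sequence, which is the condition under which the two strands could form a duplex.

```python
from typing import Dict, List, Tuple

# Translation table for Watson-Crick complements (DNA alphabet assumed).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest exact substring shared by a and b (simple DP)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def find_putative_antisense(
    sense_db: Dict[str, str],
    expressed_db: Dict[str, str],
    min_overlap: int = 20,  # hypothetical threshold, not taken from the text
) -> List[Tuple[str, str, int]]:
    """Return (expressed_id, sense_id, overlap) for candidate duplex pairs."""
    hits = []
    for exp_id, exp_seq in expressed_db.items():
        # A duplex partner of the sense strand reads as the sense sequence
        # once it has been reverse-complemented.
        rc = reverse_complement(exp_seq).upper()
        for sense_id, sense_seq in sense_db.items():
            overlap = longest_common_substring(sense_seq.upper(), rc)
            if overlap >= min_overlap:
                hits.append((exp_id, sense_id, overlap))
    return hits
```

In practice a dedicated alignment program tolerant of mismatches and gaps would replace longest_common_substring; the exact-match comparison is used here only to keep the sketch self-contained.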

According to another aspect of the present invention there is provided a
kit for quantifying at least one mRNA transcript of interest, the kit comprising
at least one oligonucleotide being designed and configured so as to be
complementary to a sequence region of the mRNA transcript of interest, the
sequence region not being complementary with a naturally occurring antisense
transcript.

According to yet another aspect of the present invention there is
provided a kit for quantifying at least one mRNA transcript of interest, the kit
comprising at least one pair of oligonucleotides including a first
oligonucleotide capable of binding the at least one mRNA transcript of interest
and a second oligonucleotide being capable of binding a naturally occurring
antisense transcript complementary to the mRNA of interest.

According to still another aspect of the present invention there is
provided a method of designing artificial antisense transcripts, the method
comprising: (a) providing a database of naturally occurring antisense
transcripts; (b) extracting from the database criteria governing structure and/or
function of the naturally occurring antisense transcripts; and (c) designing the
artificial antisense transcripts according to the criteria.

According to further features in preferred embodiments of the invention
described below the criteria governing structure and/or function of the naturally
occurring antisense transcripts are selected from the group consisting of
antisense length, complementarity length, complementarity position, intron
molecules, alternative splicing sites, tissue specificity, pathological abundance,
chromosomal mapping, open reading frames, promoters, hairpin structures,
helix structures, stem and loops, pseudoknots and tertiary interactions,
guanine and/or cytosine content, guanine tandems, adenosine content,
thermodynamic criteria, RNA duplex melting point, RNA modifications,
protein-binding motifs, palindromic sequence and predicted single stranded and
double stranded regions.

According to an additional aspect of the present invention there is
provided a computer readable storage medium comprising a database including
a plurality of sequences, wherein each sequence is of a naturally occurring
antisense transcript.

According to still further features in the described preferred
embodiments the database further includes information pertaining to each
sequence of the naturally occurring antisense transcripts, the information is
selected from the group consisting of related sense gene, antisense length,
complementarity length, complementarity position, intron molecules,
alternative splicing sites, tissue specificity, pathological abundance,
chromosomal mapping, open reading frames, promoters, hairpin structures,
helix structures, stem and loops, pseudoknots and tertiary interactions,
guanine and/or cytosine content, guanine tandems, adenosine content,
thermodynamic criteria, RNA duplex melting point, RNA modifications,
protein-binding motifs, palindromic sequence and predicted single stranded and
double stranded regions.

According to still further features in the described preferred
embodiments the database further includes information pertaining to generation
of the database and potential uses of the database.

According to yet an additional aspect of the present invention there is
provided a method of generating a database of naturally occurring antisense
transcripts, the method comprising: (a) computationally aligning a first database
including sense-oriented polynucleotide sequences with a second database
including expressed polynucleotide sequences; (b) identifying expressed
polynucleotide sequences from the second database being capable of forming a
duplex with at least one sense-oriented polynucleotide sequence of the first
database so as to identify putative naturally occurring antisense transcripts; and
(c) storing sequence information of the identified naturally occurring antisense
transcripts, thereby generating the database of the naturally occurring antisense
transcripts.

According to still an additional aspect of the present invention there is
provided a system for generating a database of a plurality of putative naturally
occurring antisense transcripts, the system comprising a processing unit, the
processing unit executing a software application configured for: (a)
computationally aligning a first database including sense-oriented
polynucleotide sequences with a second database including expressed
polynucleotide sequences; and (b) identifying expressed polynucleotide
sequences from the second database being capable of forming a duplex with at
least one sense-oriented polynucleotide sequence of the first database.

According to a further aspect of the present invention there is provided a
method of identifying putative naturally occurring antisense transcripts, the
method comprising screening a database of expressed polynucleotide
sequences according to at least one sequence criterion, the at least one
sequence criterion being selected to identify putative naturally occurring
antisense transcripts.

According to yet a further aspect of the present invention there is
provided a method of quantifying at least one mRNA of interest in a biological
sample, the method comprising: (a) contacting the biological sample with at
least one oligonucleotide capable of binding with the at least one mRNA of
interest, wherein the at least one oligonucleotide is designed and configured so
as to be complementary to a sequence region of the mRNA transcript of
interest, the sequence region not being complementary with a naturally
occurring antisense transcript; and (b) detecting a level of binding between the
at least one mRNA of interest and the at least one oligonucleotide to thereby
quantify the at least one mRNA of interest in the biological sample.

WO 03/046220 PCT/IL02/00904

According to still a further aspect of the present invention there is
provided a method of quantifying the expression potential of at least one
mRNA of interest in a biological sample, the method comprising: (a) contacting
the biological sample with at least one pair of oligonucleotides including a first
oligonucleotide capable of binding the at least one mRNA of interest and a
second oligonucleotide being capable of binding a naturally occurring antisense
transcript complementary to the mRNA of interest; and (b) detecting a level of
binding between the at least one mRNA of interest and the first oligonucleotide
and a level of binding between the naturally occurring antisense transcript
complementary to the mRNA of interest and the second oligonucleotide to
thereby quantify the expression potential of the at least one mRNA of interest in
the biological sample.

According to another aspect of the present invention there is provided a
method of quantifying at least one naturally occurring antisense transcript of
interest in a biological sample, the method comprising: (a) contacting the
biological sample with at least one oligonucleotide capable of binding with the
at least one naturally occurring antisense transcript of interest, wherein the at
least one oligonucleotide is designed and configured so as to be complementary
to a sequence region of the naturally occurring antisense transcript of interest,
the sequence region not being complementary with a naturally occurring
mRNA transcript; and (b) detecting a level of binding between the at least one
naturally occurring antisense transcript of interest and the at least one
oligonucleotide to thereby quantify the at least one naturally occurring antisense
transcript of interest in the biological sample.

According to still further features in the described preferred
embodiments the first database includes sequences of a type selected from the
group consisting of genomic sequences, expressed sequence tags, contigs,
intron sequences, complementary DNA (cDNA) sequences, pre-messenger
RNA (pre-mRNA) sequences and mRNA sequences.

According to still further features in the described preferred
embodiments the second database includes sequences of a type selected from
the group consisting of expressed sequence tags, contigs, complementary DNA
(cDNA) sequences, pre-messenger RNA (pre-mRNA) sequences and mRNA
sequences.

According to still further features in the described preferred
embodiments an average sequence length of the expressed polynucleotide
sequences of the second database is selected from a range of 0.02 to 0.8 Kb.

According to still further features in the described preferred
embodiments the second database is generated by: (i) providing a library of
expressed polynucleotides; (ii) obtaining sequence information of the expressed
polynucleotides; (iii) computationally selecting at least a portion of the
expressed polynucleotides according to at least one sequence criterion; and (iv)
storing the sequence information of the at least a portion of the expressed
polynucleotides thereby generating the second database.

According to still further features in the described preferred
embodiments the at least one sequence criterion for computationally selecting
the at least a portion of the expressed polynucleotides is selected from the group
consisting of sequence length, sequence annotation, sequence information,
intron splice consensus site, intron sharing, sequence overlap, rare restriction
site, poly(T) head, poly(A) tail, and poly(A) signal.
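
As a hedged illustration of how a few of these criteria might be applied computationally, the following Python sketch selects expressed sequences by sequence length and by the presence of a poly(A) tail or a poly(T) head. The function names, the minimum run length of 10 and the minimum sequence length of 20 are assumptions made for the example and are not specified in the text; the remaining criteria (annotation, splice sites, restriction sites and so on) are not covered.

```python
import re
from typing import Dict


def has_poly_a_tail(seq: str, min_run: int = 10) -> bool:
    """True if the sequence ends with at least min_run consecutive A residues."""
    return re.search(r"A{%d,}$" % min_run, seq.upper()) is not None


def has_poly_t_head(seq: str, min_run: int = 10) -> bool:
    """True if the sequence starts with at least min_run consecutive T residues."""
    return re.match(r"T{%d,}" % min_run, seq.upper()) is not None


def select_by_criteria(
    sequences: Dict[str, str],
    min_len: int = 20,   # hypothetical length cut-off
    min_run: int = 10,   # hypothetical homopolymer run length
) -> Dict[str, str]:
    """Keep sequences that are long enough and carry a poly(A) tail or poly(T) head."""
    selected = {}
    for seq_id, seq in sequences.items():
        if len(seq) < min_len:
            continue
        if has_poly_a_tail(seq, min_run) or has_poly_t_head(seq, min_run):
            selected[seq_id] = seq
    return selected
```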

According to still further features in the described preferred
embodiments the method further comprising the step of testing the putative
naturally occurring antisense transcripts for an ability to form the duplex with
the at least one sense oriented polynucleotide sequence under physiological
conditions.

According to still further features in the described preferred
embodiments the method further comprising the step of computationally
testing the putative naturally occurring antisense transcripts according to at
least one criterion selected from the group consisting of sequence annotation,
sequence information, intron splice consensus site, intron sharing, sequence
overlap, rare restriction site, poly(T) head, poly(A) tail, and poly(A) signal.

According to still further features in the described preferred
embodiments a length of the at least one oligonucleotide is selected from a
range of 15-200 nucleotides.

According to still further features in the described preferred
embodiments the at least one oligonucleotide is a single stranded
oligonucleotide.

According to still further features in the described preferred 
embodiments the at least one oligonucleotide is a double stranded 
oligonucleotide. 

According to still further features in the described preferred
embodiments a guanine and cytosine content of the at least one
oligonucleotide is at least 25 %.
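
The length range of 15-200 nucleotides and the minimum guanine and cytosine content of 25 % stated above can be checked with a few lines of Python. This is a minimal sketch; the function names gc_fraction and meets_oligo_criteria are introduced here for illustration only.

```python
def gc_fraction(oligo: str) -> float:
    """Fraction of G and C residues in the oligonucleotide (0.0 to 1.0)."""
    if not oligo:
        raise ValueError("empty oligonucleotide")
    s = oligo.upper()
    return (s.count("G") + s.count("C")) / len(s)


def meets_oligo_criteria(
    oligo: str,
    min_len: int = 15,     # lower bound of the stated length range
    max_len: int = 200,    # upper bound of the stated length range
    min_gc: float = 0.25,  # stated minimum G+C content (25 %)
) -> bool:
    """True if the oligonucleotide satisfies the length and G+C criteria."""
    return min_len <= len(oligo) <= max_len and gc_fraction(oligo) >= min_gc
```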

According to still further features in the described preferred
embodiments the at least one oligonucleotide is labeled.

According to still further features in the described preferred
embodiments the at least one oligonucleotide is attached to a solid substrate.

According to still further features in the described preferred
embodiments the solid substrate is configured as a microarray and wherein the
at least one oligonucleotide includes a plurality of oligonucleotides each
attached to the microarray in a regio-specific manner.

According to still further features in the described preferred 
embodiments a length of each of the first and second oligonucleotides is 
selected from a range of 15-200 nucleotides. 

According to still further features in the described preferred
embodiments the first and second oligonucleotides are single stranded
oligonucleotides.

According to still further features in the described preferred
embodiments the first and second oligonucleotides are double stranded
oligonucleotides.

According to still further features in the described preferred
embodiments a guanine and cytosine content of each of the first and second
oligonucleotides is at least 25 %.

According to still further features in the described preferred
embodiments the first and second oligonucleotides are labeled.

According to still further features in the described preferred 
embodiments the first and second oligonucleotides are attached to a solid 
substrate. 

According to still further features in the described preferred
embodiments the solid substrate is configured as a microarray and wherein each
of the first and second oligonucleotides includes a plurality of oligonucleotides
each attached to the microarray in a regio-specific manner.

According to yet another aspect of the present invention there is provided a
method of identifying a novel drug target, the method comprising: (a)
determining expression level of at least one naturally occurring antisense
transcript of interest in cells characterized by an abnormal phenotype; and (b)
comparing the expression level of the at least one naturally occurring antisense
transcript of interest in the cells characterized by an abnormal phenotype to an
expression level of the at least one naturally occurring antisense transcript of
interest in cells characterized by a normal phenotype, to thereby identify the
novel drug target.

According to still further features in the described preferred
embodiments the abnormal phenotype of the cells is selected from the group
consisting of biochemical phenotype, morphological phenotype and nutritional
phenotype.

According to still further features in the described preferred
embodiments determining expression level of at least one naturally occurring
antisense transcript of interest is effected by at least one oligonucleotide
designed and configured so as to be complementary to a sequence region of the
at least one naturally occurring antisense transcript of interest, the sequence
region not being complementary with a naturally occurring mRNA transcript.

According to still another aspect of the present invention there is provided
a method of treating or preventing a disease, condition or syndrome associated
with an upregulation of a naturally occurring antisense transcript
complementary to a naturally occurring mRNA transcript, the method
comprising administering a therapeutically effective amount of an agent for
regulating expression of the naturally occurring antisense transcript.

According to still further features in the described preferred
embodiments the agent for regulating expression of the naturally occurring
antisense transcript is at least one oligonucleotide designed and configured so
as to hybridize to a sequence region of the at least one naturally occurring
antisense transcript.

According to still further features in the described preferred
embodiments the at least one oligonucleotide is a ribozyme.

According to still further features in the described preferred 
embodiments the at least one oligonucleotide is a sense transcript. 

According to a supplementary aspect of the present invention there is
provided a method of diagnosing a disease, condition or syndrome associated
with a substandard expression ratio of an mRNA of interest over a naturally
occurring antisense transcript complementary to the mRNA of interest, the
method comprising: (a) quantifying expression level of the mRNA of interest
and the naturally occurring antisense transcript complementary to the mRNA of
interest; and (b) calculating the expression ratio of the mRNA of interest over the
naturally occurring antisense transcript complementary to the mRNA of
interest, thereby diagnosing the disease, condition or syndrome.
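
The ratio calculation of step (b) may be illustrated with the following Python sketch. The text does not define what makes a ratio "substandard"; the comparison against a reference ratio scaled by a tolerance factor, and the names is_substandard, reference_ratio and tolerance, are assumptions introduced solely for the example.

```python
def expression_ratio(mrna_level: float, antisense_level: float) -> float:
    """Ratio of the mRNA level over the level of its antisense transcript."""
    if antisense_level <= 0:
        raise ValueError("antisense level must be positive")
    return mrna_level / antisense_level


def is_substandard(
    ratio: float,
    reference_ratio: float,
    tolerance: float = 0.5,  # hypothetical cut-off, not taken from the text
) -> bool:
    """True if the ratio falls below the reference ratio scaled by the tolerance."""
    return ratio < reference_ratio * tolerance
```

For example, with a reference ratio of 1.0 and the assumed tolerance of 0.5, a sample in which the mRNA level is 2.0 and the antisense level is 5.0 (ratio 0.4) would be flagged by this sketch.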

The present invention successfully addresses the shortcomings of the
presently known configurations by providing a novel approach for identifying
naturally occurring antisense transcripts, methods of designing artificial
antisense transcripts according to information derived therefrom and methods
and kits using naturally occurring and synthetic antisense transcripts.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is herein described, by way of example only, with
reference to the accompanying drawings. With specific reference now to the
drawings in detail, it is stressed that the particulars shown are by way of
example and for purposes of illustrative discussion of the preferred
embodiments of the present invention only, and are presented in the cause of
providing what is believed to be the most useful and readily understood
description of the principles and conceptual aspects of the invention. In this
regard, no attempt is made to show structural details of the invention in more
detail than is necessary for a fundamental understanding of the invention, the
description taken with the drawings making apparent to those skilled in the art
how the several forms of the invention may be embodied in practice.
In the drawings: 

FIG. 1 illustrates EST alignment along genomic DNA, generated
according to the teachings of the present invention. Alignment results identify
two strand groups of transcripts, i.e., sense transcripts and antisense transcripts
with an indicated sequence overlap.

FIG. 2 illustrates a system designed and configured for generating a 
database of naturally occurring antisense sequences generated according to the 
teachings of the present invention. 

FIG. 3 illustrates a remote configuration of the system described in
Figure 2.

FIGs. 4a-k are sequence alignments of overlapping regions of selected
naturally occurring antisense and sense sequence pairs identified according to
the teachings of the present invention.

FIGs. 5a-g are sequence alignments of overlapping regions of selected
naturally occurring antisense and sense sequence pairs identified according to
the teachings of the present invention.

FIG. 6 schematically illustrates two transcription products of 53BP1
gene (red and green) and their corresponding partial complementary antisense
transcripts of the 76p gene (blue). Numbers in parentheses indicate length of
sequence complementation. Schematic location of strand-specific RNA probes
used for northern blotting of sense (53BP1, Riboprobe#1) and antisense (76p,
Riboprobe#2) transcripts is shown.

FIG. 7 is an autoradiogram of a northern blot analysis depicting cellular
distribution and expression levels of 53BP1 transcripts. Arrows on the right
indicate the molecular weight of the identified 53BP1 transcripts relative to the
migration of 28S and 18S ribosomal RNA subunits. Numbers on the left
denote the size of molecular weight markers in Kb.

FIG. 8 is an autoradiogram of a northern blot analysis depicting cellular
distribution and expression levels of 76p transcripts. Arrows on the right
indicate the molecular weight of the identified 76p transcripts relative to the
migration of 28S and 18S ribosomal RNA subunits. Numbers on the left
denote the size of molecular weight markers in Kb.

FIG. 9 is an autoradiogram of a northern blot analysis depicting tissue
distribution and expression levels of 76p transcripts. Arrows on the right
indicate the molecular weight of the identified 76p transcripts. Numbers on the
left denote the migration of molecular weight marker in Kb.

FIG. 10 illustrates the genomic organization of the 53BP1 gene and 76p
gene, as elucidated from the RT-PCR analysis presented in the Examples
section hereinbelow. Black arrows indicate the location of the primers used for
RT-PCR analysis. Asterisks denote stop codons.

FIG. 11 schematically illustrates two transcription products of CIDE-B
gene and their corresponding partial complementary antisense transcript of the
BLTR2 gene. Schematic location of the strand-specific 430 nucleotide RNA
probe used for northern analysis of sense (CIDE-B) and antisense (BLTR2)
transcripts is shown. Dashed rectangles indicate the predicted coding sequence
of the transcripts.

FIG. 12 is an autoradiogram of a northern blot analysis depicting cellular
distribution and expression levels of BLTR2 transcripts. Arrows on the right
indicate the molecular weight of the identified BLTR2 transcripts relative to the
migration of 28S and 18S ribosomal RNA subunits. Numbers on the left
denote the size of molecular weight markers in Kb.

FIG. 13 is an autoradiogram of a northern blot analysis depicting
cellular distribution and expression levels of CIDE-B transcripts. Arrows on
the right indicate the molecular weight of the identified CIDE-B transcripts
relative to the migration of 28S and 18S ribosomal RNA subunits. Numbers
on the left denote the size of molecular weight markers in Kb.

FIG. 14 schematically illustrates a transcription product of the APAF-1 gene
and its corresponding partially complementary antisense transcripts of the EB-1
gene. Schematic location of the strand-specific 366 nucleotide RNA probe
used for northern analysis of sense (APAF-1) and antisense (EB-1) transcripts
is shown. Asterisks indicate the predicted coding sequence borders of the
transcripts.

FIGs. 15a-b are autoradiograms of northern blot analyses depicting
cellular distribution and expression levels of EB-1 (Figure 15a) and APAF-1
transcripts (Figure 15b). Numbers on the left denote the size of molecular
weight markers in Kb.

FIG. 16 schematically illustrates a transcription product of the MINK-2
gene and its corresponding partially complementary antisense transcript of the
AchR-e gene. Schematic location of the strand-specific 280 nucleotide RNA
probe used for northern analysis of sense (Mink-2) and antisense (AchR-e)
transcripts is shown.

FIGs. 17a-b are autoradiograms of northern blot analyses depicting
cellular distribution and expression levels of AchR-e antisense transcripts
(Figure 17a) and the sense complementary transcript of Mink-2 (Figure 17b).
Arrows on the right denote the migration of molecular weight markers in Kb.

FIG. 18 schematically illustrates a transcription product of the Cyclin-E2
gene and its corresponding partially complementary antisense transcript.
Schematic location of strand-specific RNA probes used for northern blotting of
sense (Riboprobe#1) and antisense (Riboprobe#2) transcripts is shown.

FIGs. 19a-b are autoradiograms of northern blot analyses depicting
cellular distribution and expression levels of Cyclin E2 antisense transcript
(Figure 19a) and the sense complementary transcript (Figure 19b). Arrows on
the left denote the migration of molecular weight markers in Kb.

FIG. 20 illustrates results from RT-PCR analysis of the expression
patterns of CIDE-B transcript and its complementary naturally occurring
antisense transcript following concentration dependent induction of apoptosis.
Lanes: (1) 50 µM etoposide; (2) 100 µM etoposide; (3) 250 µM etoposide; (4)
500 µM etoposide; (5) 10 nM staurosporine; (6) 100 nM staurosporine; (7) 250
nM staurosporine; (8) 1000 nM staurosporine; (9) untreated cells (UT).

FIGs. 21a-c are results of RT-PCR analyses depicting expression
patterns of AchRe and its naturally occurring antisense transcript following
time-dependent induction of differentiation. Figure 21a illustrates the position
of riboprobes used for reverse transcription reaction. Figure 21b shows the
reciprocal expression pattern of sense and antisense transcripts (indicated by
arrows). Figure 21c shows the expression pattern of the antisense transcript
alone.

FIGs. 22a-j illustrate results of northern blot analysis of sense/antisense
clusters revealing positive signals for sense/antisense genes in the microarray
analysis. Diagrams describing genomic organization of the relevant region for
each of the sense/antisense clusters are included above the autoradiograms, and
regions of overlap (including GenBank accession number) from which the
strand-specific riboprobes were derived are included. Sense-antisense pair
numbers are as they appear in the microarray (as depicted in Table S2 on the
attached CD-ROM3 and in conversion Table 6). Figure 22a reveals expression
patterns of randomly selected sequence pair number 235, denoted as Rand_235
in Table 6. Similarly, Figure 22b corresponds to pair number 173, Figure 22c
to pair number 248, Figure 22d to pair number 6, Figure 22e to pair number
216, Figure 22f to pair number 239, Figure 22g to pair number 202, Figure 22h
to pair number 114, Figure 22i to pair number 188, and Figure 22j to pair
number 223. Eight pairs (Figures 22a-h) evaluated revealed positive signals for
both sense and antisense expression, while two (Figures 22i-j) revealed a
positive signal for only one of the genes, with the counterpart being a known
RefSeq mRNA.

FIG. 23 is a Table depicting expression patterns in various cell lines and
tissues as probed with a subset of 264 pairs from the putative sense/antisense
dataset of the present invention. The pairs are denoted by the pair number and
described in Table S1 of CD-ROM3. "C" and "AC" denote the two counterpart
probes. Expression was also verified for positive controls, including the
ubiquitously expressed genes gapdh, actin, hsp70 and gnblll in various
concentrations, and 11 previously documented sense/antisense pairs.
Expression thresholds were verified and indicated as if the probe passed
the threshold in at least one cell line or tissue or if the probe did not pass
the threshold in all experiments. In cases where both the sense and the
antisense oligo passed the expression threshold, the antisense was declared
"verified". In cases where only one of the probes passed the expression
threshold, but the other probe was fully contained within a known mRNA
deposited in GenBank, the antisense was declared "indirectly verified".
Normalization for microarray signals was conducted as described in the
methods section. Rji ratios were obtained for each cell line/tissue assessed.
Cases of flagged-out spots for which there was no information were marked
"1.00". Data represent values of the two reciprocal experiments.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is of methods of identifying naturally occurring
antisense transcripts, which can be used in kits and methods for quantifying
gene expression levels. Specifically, the antisense molecules and related
oligonucleotides generated according to information derived therefrom of the
present invention can be used to detect, quantify, or specifically regulate
antisense and respective sense transcripts thereby enabling detection and
treatment of a wide range of disorders.

The principles and operation of the present invention may be better
understood with reference to the drawings and accompanying descriptions.

Before explaining at least one embodiment of the invention in detail, it is
to be understood that the invention is not limited in its application to the details
of construction and the arrangement of the components set forth in the
following description or illustrated in the drawings described in the Examples
section. The invention is capable of other embodiments or of being practiced or
carried out in various ways. Also, it is to be understood that the phraseology
and terminology employed herein is for the purpose of description and should
not be regarded as limiting.

Terminology 

As used herein, the term "oligonucleotide" refers to a single stranded or
double stranded oligomer or polymer of ribonucleic acid (RNA) or
deoxyribonucleic acid (DNA) or mimetics thereof. This term includes
oligonucleotides composed of naturally-occurring bases, sugars and covalent
internucleoside linkages (e.g., backbone) as well as oligonucleotides having
non-naturally-occurring portions, which function similarly. Such modified or
substituted oligonucleotides are often preferred over native forms because of
desirable properties such as, for example, enhanced cellular uptake, enhanced
affinity for nucleic acid target and increased stability in the presence of
nucleases.

The term "antisense" refers to a complementary strand of an mRNA
transcript, e.g., antisense RNA.

The phrase "naturally occurring antisense transcripts" refers to RNA
transcripts encoded from an antisense strand of the DNA. These endogenous
transcripts exhibit at least partial complementarity to mRNA transcripts
transcribed from the sense strand of a DNA, also termed sense transcripts.
Cis-encoded naturally occurring antisense transcripts are transcribed from the
same locus as the sense transcripts. Trans-encoded antisense transcripts are
transcribed from a different locus than the respective sense transcripts.

The phrase "antisense strand" or "anticoding strand" refers to a strand of
DNA, which serves as a template for mRNA transcription and as such is
complementary to the mRNA transcript formed.

The phrase "sense strand" or "coding strand" refers to the strand of
DNA, which is identical to the mRNA transcript formed (with thymine in
place of uracil).

The phrase "complementary DNA" (cDNA) refers to the double stranded
or single stranded DNA molecule, which is synthesized from a messenger RNA
template.

The phrase "sense oriented polynucleotides" refers to polynucleotide
sequences of a complementary or genomic DNA. Such polynucleotide
sequences can be from exon regions, in which case they can encode mRNAs or
portions thereof, or from intron regions, in which case they typically do not
encode mRNA or portions thereof.

The term "contig" refers to a series of overlapping sequences with 
sufficient identity to create a longer contiguous sequence. 

The term "cluster" refers to a plurality of contigs all derived, with a high
degree of probability, from a single gene. Clusters are generally formed based
upon a specified degree of homology and overlap (e.g., a stringency). The
different contigs in a cluster do not typically represent the entire sequence of
the gene, rather the gene may comprise one or more unknown intervening
sequences between the defined contigs.

The phrase "open reading frame" (ORF) refers to a nucleotide sequence,
which could potentially be translated into a polypeptide. Such a stretch of
sequence is uninterrupted by a stop codon. An ORF that represents the coding
sequence for a full protein begins with an ATG "start" codon and terminates
with one of the three "stop" codons. For the purposes of this application, an
ORF may be any part of a coding sequence, with or without start and/or stop
codons. For an ORF to be considered as a good candidate for coding for a bona
fide cellular protein, a minimum size requirement is often set, for example, a
stretch of DNA that would code for a protein of 50 amino acids or more. An
ORF is not usually considered an equivalent to a gene or locus until a
phenotype is associated with a mutation in the ORF, an mRNA transcript for a
gene product generated from the ORF's DNA has been detected, and/or the
ORF's protein product has been identified.

The term "annotation" refers to a functional or structural description of a
sequence, which may include identifying attributes such as locus name,
poly(A)/poly(T) tail and/or signal, key words, Medline references and
orientation cloning data.

Naturally occurring antisense molecules can play a role in sense
transcript stability and function (e.g., translation). To date, most, if not all of
the information relating to naturally occurring antisense transcripts was
obtained by either low efficiency computational approaches (described
hereinabove) or by approaches utilizing RNase protection assays, northern blot
analysis, strand-specific RT-PCR, subtractive hybridization, differential plaque
hybridization, affinity chromatography, electrospray mass spectrometry and the
like. These methods, though highly reliable, are extremely laborious and time
consuming, and are directed at individual target transcripts. As such, current
approaches for uncovering antisense transcripts can detect only a
negligible portion of the number of naturally occurring antisense molecules
thought to exist.

As described hereinunder and in the Examples section, which follows,
the present invention provides a novel approach for systematically identifying
naturally occurring antisense molecules.

Aside from large scale applicability, the present method can be used to
identify naturally occurring antisense molecules even in cases where the
antisense transcriptional unit is localized to an intron of an expressed gene or to
a different locus than the complementary sense encoding gene (e.g.,
trans-encoded antisense), or in cases where the antisense molecule lacks an open
reading frame or appreciable complementarity to known sense molecules.
Antisense transcripts uncovered according to the teachings of the present
invention can be used for detecting and accurately quantifying respective sense
counterparts as well as for sensibly designing artificial antisense molecules
suitable for down-regulation of sense counterparts.

Thus, according to one aspect of the present invention there is provided
a method of identifying putative naturally occurring antisense transcripts.

The method according to this aspect of the present invention is effected 
by the following steps. 

First, sense-oriented polynucleotide sequences of a first database are
computationally aligned with expressed polynucleotide sequences of a second
database.

Following computational alignment, expressed polynucleotide sequences
are analyzed according to one or more criteria for their ability to hybridize or
form a duplex or partial complementation with the sense-oriented
polynucleotide sequences (further detailed hereinbelow and in the Examples
section which follows).

Expressed polynucleotide sequences which are capable of forming a
duplex with sense oriented sequences are considered as putative naturally
occurring antisense molecules and as such can be stored in a database which
can be generated by a suitable computing platform.

Final confirmation of computationally obtained putative naturally
occurring antisense molecules can be effected either computationally or
preferably by using suitable laboratory methodologies based on nucleotide
hybridization, including RNase protection assay, subtractive hybridization,
differential plaque hybridization, affinity chromatography, electrospray mass
spectrometry, northern analysis, RT-PCR and the like (for further details see the
Examples section).

Information derived from the sequence, sense position and other
structural characteristics of the naturally occurring antisense transcripts
identified according to the teachings of the present invention can be used to
quantify respective sense transcripts of interest or to generate corresponding
artificial antisense polynucleotides, which can be packaged in diagnostic or
therapeutic kits and implemented in various therapeutic and diagnostic
methods.

Expressed polynucleotide sequences used as a potential source for
identifying naturally occurring antisense transcripts according to this aspect of
the present invention are preferably libraries of expressed messenger RNA [i.e.,
expressed sequence tags (EST), cDNA clones, contigs, pre-mRNA, etc.]
obtained from tissue or cell-line preparations which can include genomic and/or
cDNA sequence.

Expressed polynucleotide sequences, according to this aspect of the
present invention can be retrieved from pre-existing publicly available
databases (e.g., GenBank database maintained by the National Center for
Biotechnology Information (NCBI), part of the National Library of Medicine,
and the TIGR database maintained by The Institute for Genomic Research) or
private databases (e.g., the LifeSeq™ and PathoSeq™ databases available
from Incyte Pharmaceuticals, Inc. of Palo Alto, CA).

Alternatively, the sequence database of the expressed polynucleotide
sequences utilized by the present invention can be generated from sequence
libraries (e.g., cDNA libraries, EST libraries, mRNA libraries and others).
cDNA libraries are suitable sources for expressed sequence information.

Generating a sequence database in such a case is typically effected by
tissue or cell sample preparation, RNA isolation, cDNA library construction
and sequencing.

It will be appreciated that such cDNA libraries can be constructed from
RNA isolated from whole organisms, tissues, tissue sections, or cell
populations. Libraries can also be constructed from tissue reflecting a
particular pathological or physiological state. Of particular interest are libraries
constructed from sources associated with certain disease states, including
malignant, neoplastic, hyperplastic tissues and the like.

Once raw sequence data is obtained, sequences are selected and
preferably annotated before being stored in a database. Selection proceeds
according to one or more sequence criteria, which will be further detailed
hereinunder. The editing, annotation and selection process is divided into two
stages of processing. One stage comprises removal of repetitive, redundant or
non-informative and contaminant sequences. The second stage involves selection
of suitable candidates of putative naturally occurring antisense sequences.

The following section describes the different selection criteria which can
be used for sequence filtering.

Vector contamination - "chops" vector elements and linker motifs used
for the process of cloning from desired expressed nucleotide sequences. This
selection can be effected by screening manually updated databases of sequences
included in commonly used expression or cloning vectors.

Contaminating sequences - includes sequences which are derived from
an undesired source. Such sequences can be recognized by their nucleotide
distribution and/or by homology searches such as alignment searches using any
sequence alignment algorithm such as BLAST (Basic Local Alignment Search
Tool, available through www.ncbi.nlm.nih.gov/BLAST) or the Smith-Waterman
algorithm. Other contaminating sequences may include sequences
exhibiting high occurrence of di-nucleotide distribution mostly related to
sequencing artifacts and ribosomal RNA sequences.

Repetitive elements and low complexity sequences - eliminates or
masks expressed sequences comprising known repetitive elements (ALU, L1
etc.) and low complexity sequences (e.g., a di- or tri-nucleotide repeat). Such
elimination is preferably effected by comparison with a database of known
repetitive elements. It will be appreciated that this type of selection is mostly
species specific. Masking of low complexity sequences can be effected by
substituting an N (i.e., an inert character) for the actual nucleotide (i.e., G, A, T,
or C). Masking of low complexity sequences facilitates further computational
analysis and maintains the spacing of the molecule.
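
By way of illustration only, the masking step described above might be sketched as follows. The function name mask_low_complexity and the min_repeats parameter are hypothetical and are not part of the described method; the regular expression is a simplification that also masks long homopolymer runs, and it does not substitute for comparison against a database of known repetitive elements.

```python
import re


def mask_low_complexity(seq: str, min_repeats: int = 6) -> str:
    """Replace tandem di-/tri-nucleotide repeats with 'N', keeping the length.

    min_repeats is an assumed cut-off: a 2-3 base unit must occur at least
    this many times in a row before the run is masked.
    """
    # One 2-3 base unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    # Substitute one 'N' per masked base so downstream coordinates are unchanged.
    return pattern.sub(lambda m: "N" * len(m.group(0)), seq.upper())


example = "GATTACA" + "CA" * 8 + "GGCTT"
masked = mask_low_complexity(example)
```

Because every masked base is replaced by exactly one N, the spacing of the molecule is maintained, as the text requires.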

Sequence length - preferred expressed sequences are of a length 
between 20-2000, preferably 20-1000, more preferably 20-500, most preferably 
20-300 base pairs. 

Sequence annotation - expressed sequences retrieved from external
databases, e.g., GenBank, oftentimes include an annotation which indicates
direction of the sequencing of the insert clone (i.e., 5' or 3' direction). Sequence
annotation, though "noisy" by nature due to multiple entries from various
sources, artifacts taking place during directional cloning and incidence of
palindromic eight-cutter restriction sites located at the end of the sequence, can
serve as an important tool for deducing strand identity using dedicated
computer software, which is further discussed hereinunder.

Intron splice site consensus sequence and intron splice site sharing - intron
sequences nearly always begin with a di-nucleotide sequence of GT ("splice
donor") and end with an AG ("splice acceptor") preceded by a pyrimidine-rich
tract. This consensus sequence is part of the signal for splicing. Intron splice
site consensus sequence on the complementary strand (e.g., antisense strand)
begins with CT and ends with AC. Thus, combined with genomic data,
expressed sequences having a GT...AG can be considered as sense-oriented
sequences, while a CT...AC pattern is considered as an antisense oriented




sequence. This selection criterion is very stringent since only negligible
portions of introns have a CT...AC pattern. Sequences that share a similar
splicing pattern, as deduced by alignment to genomic data, may be considered
as having the same sense orientation, also termed herein as "intron sharing". It
will be appreciated by one skilled in the art that using these selection criteria
requires a careful and accurate alignment of expressed sequences to genomic
sequence.
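
The splice-site criterion can be sketched, under stated assumptions, as a check of the first and last two genomic bases of each alignment gap. The names orientation_from_introns and reverse_complement, and the representation of introns as 0-based half-open coordinate pairs, are hypothetical choices made for this illustration only; a production pipeline would take the gaps from a spliced alignment tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orientation_from_introns(genome: str, introns: list) -> str:
    """Infer strand orientation from intron boundary di-nucleotides.

    introns holds (start, end) pairs, 0-based and half-open, giving the
    gaps of the expressed sequence when aligned to `genome`.
    """
    sense = antisense = 0
    for start, end in introns:
        donor, acceptor = genome[start:start + 2], genome[end - 2:end]
        if (donor, acceptor) == ("GT", "AG"):
            sense += 1          # canonical GT...AG intron
        elif (donor, acceptor) == ("CT", "AC"):
            antisense += 1      # GT...AG seen from the opposite strand
    if sense and not antisense:
        return "sense"
    if antisense and not sense:
        return "antisense"
    return "undetermined"       # no introns, or conflicting evidence


genome = "ATGGCC" + "GTAAGTTTTCTTTCAG" + "GGATCC"
flipped = reverse_complement(genome)
```

Conflicting or absent introns are reported as undetermined rather than forced into either set, in keeping with the stringency of this criterion.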

Poly(A) tails and Poly(T) heads - most eukaryotic mRNA molecules
contain a poly-adenylation [poly(A)] tail at their 3' end. This poly(A) tail is not
encoded by DNA. Therefore an expressed sequence which has a poly(A) tail
can be considered as sense oriented. Similarly, poly(T) heads, which are not
encoded from a genomic sequence, indicate that a sequence is of the opposite
direction, namely antisense oriented. Notably, genomically encoded poly(A)
tails and poly(T) heads provide no information as to the sequence orientation.
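
A minimal sketch of this criterion follows. It assumes that, for each expressed sequence, the genomic segment spanning the same positions is available, so that a terminal tract present in the read but absent from the genome can be recognized. The function names and the min_run cut-off of 10 bases are hypothetical.

```python
def terminal_run(seq: str, base: str, at_3prime: bool) -> int:
    """Length of the uninterrupted run of `base` at the 3' or 5' end."""
    ordered = reversed(seq) if at_3prime else seq
    count = 0
    for ch in ordered:
        if ch != base:
            break
        count += 1
    return count


def orientation_from_poly_tract(est: str, genomic: str, min_run: int = 10) -> str:
    """Classify a read by a non-genomic poly(A) tail or poly(T) head.

    `genomic` is the genomic segment covering the same span as `est`;
    a tract that is also present in the genome is ignored.
    """
    tail = (terminal_run(est, "A", True) >= min_run
            and terminal_run(genomic, "A", True) < min_run)
    head = (terminal_run(est, "T", False) >= min_run
            and terminal_run(genomic, "T", False) < min_run)
    if tail and not head:
        return "sense"
    if head and not tail:
        return "antisense"
    return "undetermined"


body = "GCTAGCTAGGCT"
with_tail = body + "A" * 15
with_head = "T" * 15 + "AGCCTAGCTAGC"
```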

Poly(A) signal - some mature mRNA transcripts contain an internal
AAUAAA sequence. This internal sequence is part of an endonuclease
cleavage signal. Following cleavage by the endonuclease, a poly(A)
polymerase adds about 250 A residues to the 3' end of the transcript. Hence,
expressed sequences containing a poly(A) signal can be considered as sense
oriented.
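
As a hedged illustration, detection of the signal might be limited to a window near the 3' end of the expressed sequence. The window size of 50 bases and the function name has_polya_signal are assumptions introduced here and are not specified in the text.

```python
def has_polya_signal(seq: str, window: int = 50) -> bool:
    """Return True if AAUAAA (AATAAA in cDNA) lies in the last `window` bases."""
    # Normalise case and treat RNA (U) and cDNA (T) spellings alike.
    tail = seq.upper().replace("U", "T")[-window:]
    return "AATAAA" in tail


near_end = "GGCC" * 10 + "AATAAA" + "GCTGCTGCTGCTGCTGC"
far_from_end = "AATAAA" + "GC" * 40
```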

Rare restriction site used for cloning - for example, eight cutter
endonucleases, which cleave 8-mer palindromic sequences and are characterized
by a low frequency of cutting, are often used in genome mapping and EST library
preparations (e.g., NotI, commercially available from Promega:
www.promega.com). Therefore, when a cluster of overlapping expressed
sequences is characterized by a portion of sequences starting with a digestion
site and another portion ending with the same, these sequences may be
considered as encoded from the same strand. However, any endonuclease
capable of digesting a palindromic sequence (e.g., XhoI, SalI, PacI etc.) may
also cause distorted sequence clustering, therefore strand orientation is
preferably determined using other parameters as well.

Sequence overlap - sequences that completely overlap are considered to
have the same strand orientation.

The above described parameters are used individually or in combination
to analyze the expressed polynucleotide sequences so as to select antisense
oriented sequences.

Selection can be effected on the basis of a single criterion or several
criteria considered individually or in combination.

In cases where several criteria are examined, a scoring system, e.g., a
scoring matrix, is preferably used.

Since in some cases identifying an intron splicing consensus site may be
more important than both sequence annotation and NotI alignment, while in
others, detection of poly(A) tails and poly(T) heads might be the most
significant criterion, the use of a scoring matrix in which each criterion is
weighted enables one to select qualified antisense transcripts.

Such a scoring matrix can list the various expressed polynucleotide
sequences across the X-axis of the matrix while each criterion can be listed on
the Y-axis of the matrix. Each criterion includes both a predetermined range of
values, from which a single value is selected for each sequence, and a weight.
Each sequence is scored at each criterion according to its value and the weight
of the criterion.

When using such a scoring matrix the scores of each criterion of a
specific sequence are summed and the results are analyzed.

Expressed sequences which exhibit a total score greater than a particular
stringency threshold are grouped as members of either a sense-oriented
sequence set or antisense-oriented sequence set; the higher the score the more
stringent the criteria of grouping.
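
One possible form of such a weighted scoring scheme is sketched below. The criterion names, the weights, the signed encoding (+1 for sense evidence, -1 for antisense evidence, 0 for none) and the threshold are all assumptions chosen for illustration; the text does not prescribe particular values.

```python
# Hypothetical weights: larger values mark criteria treated as more reliable.
CRITERION_WEIGHTS = {
    "splice_site": 3.0,
    "poly_tract": 2.0,
    "polya_signal": 1.0,
    "annotation": 0.5,
}


def total_score(evidence: dict) -> float:
    """Sum weight * value over the criteria observed for one sequence."""
    return sum(CRITERION_WEIGHTS[name] * value for name, value in evidence.items())


def assign_orientation(evidence: dict, threshold: float = 2.0) -> str:
    """Group a sequence as sense or antisense only if its score clears the threshold."""
    score = total_score(evidence)
    if score > threshold:
        return "sense"
    if score < -threshold:
        return "antisense"
    return "unassigned"


conflicting = {"splice_site": -1, "poly_tract": -1, "annotation": 1}
```

In this sketch a "noisy" annotation pointing one way is outweighed by splice-site and poly(T) evidence pointing the other way, and raising the threshold makes the grouping more stringent.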

It will be appreciated that the above described analysis can take place
prior to computational alignment to sense oriented sequences, i.e., during the
process of editing the expressed sequence database which is described
hereinabove. Alternatively, selection can take place following computational
alignment, thus further facilitating identification of proper duplex formation
between the sense oriented polynucleotide sequences and expressed
polynucleotide sequences.

Genomic DNA or a portion thereof is preferably used as sense-oriented
sequence data according to this aspect of the present invention. It is
conceivable that the present invention can determine sense orientation and
antisense orientation of a database of expressed sequences simply by
computationally aligning the sequences of the expressed database onto the
genome, and finding whether two complementary expressed sequences
hybridize to the genome (e.g., virtually generate a double stranded portion
thereof). Such two overlapping sequences constitute sense and naturally
occurring antisense transcripts.
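
The idea of locating two expressed sequences that map to the same genomic region on opposite strands can be sketched as follows. The Alignment record, the single-contig coordinate system and the min_overlap default are hypothetical simplifications; in particular, the sketch compares alignment spans only and ignores exon structure.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Alignment:
    name: str
    start: int   # 0-based, inclusive genomic coordinate
    end: int     # exclusive genomic coordinate
    strand: str  # "+" or "-"


def overlap(a: Alignment, b: Alignment) -> int:
    """Number of genomic positions shared by two alignment spans."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def antisense_pairs(alignments, min_overlap: int = 20):
    """Return (name, name, overlap) for spans on opposite strands that overlap."""
    pairs = []
    for a, b in combinations(alignments, 2):
        shared = overlap(a, b)
        if a.strand != b.strand and shared >= min_overlap:
            pairs.append((a.name, b.name, shared))
    return pairs


alignments = [
    Alignment("est1", 100, 400, "+"),
    Alignment("est2", 350, 600, "-"),
    Alignment("est3", 380, 500, "+"),
    Alignment("est4", 900, 1000, "-"),
]
pairs = antisense_pairs(alignments)
```

The pairwise comparison is quadratic in the number of alignments; for a large expressed sequence database, sorting by start coordinate or using an interval index would be a more practical choice.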

Utilizing genomic DNA as a sense oriented template is preferred because
it enables the following: (i) identifying trans-encoded antisense transcripts; (ii)
analyzing intron splice consensus site and intron sharing; (iii) omitting
genomically encoded poly(A) and poly(T) sequences; and (iv) analyzing
sequences encompassing eight-cutter restriction sites.

Computational alignment of expressed polynucleotide sequences to the
sense-oriented polynucleotide sequences (e.g., genomic sense sequences) can
be effected using any commercially available alignment software, including
sequence alignment tools utilizing algorithms such as BLAST (Basic Local
Alignment Search Tool, available through www.ncbi.nlm.nih.gov/BLAST) or
Smith-Waterman.

Assembly software is preferably used according to this aspect of the
present invention. Such software is of high value when complete genomic
information is unavailable or when handling large amounts of expressed
sequence data. A number of commonly used computer software fragment read
assemblers capable of forming clusters of expressed sequences are now
available. These packages include, but are not limited to, The TIGR Assembler
[Sutton G. et al. (1995) Genome Science and Technology 1:9-19], GAP
[Bonfield JK. et al. (1995) Nucleic Acids Res. 23:4992-4999], CAP2 [Huang
X. et al. (1996) Genomics 33:21-31], The Genome Construction Manager
[Laurence CB. et al. (1994) Genomics 23:192-201], Bio Image Sequence
Assembly Manager, SeqMan [Swindell SR. and Plasterer JN. (1997) Methods
Mol. Biol. 70:75-89], LEADS and GenCarta (Compugen Ltd. Israel).

Computer assembly and alignment programs can be modified to
incorporate sequence criteria for determining sense or antisense orientation of
expressed nucleotide sequences, as described hereinabove, thereby avoiding
deliberate inversion of sequences during the assembly process, which ignores
the natural orientation of the sequences (i.e., sense or antisense orientation).
Figure 1 illustrates results of expressed sequence assembly against genomic
data and final distinction between sense oriented transcripts and antisense
oriented transcripts of a single gene.

Following a proper alignment of expressed sequences to sense oriented
polynucleotide sequences, duplexes are identified. The term "duplex" is used
herein to indicate that a sequence identified according to this aspect of the
present invention is complementary to a sense-oriented polynucleotide
sequence. Complementation may be to a portion of the sense sequence, i.e., a
region thereof, or alternatively, to two or more non-contiguous regions, which
may be separated by one or more nucleotides on the sense strand.

The formation of sense-antisense duplexes does not require 100 %
complementation nor does it require participation of the entire sense/antisense
transcript sequence. The sense or antisense transcripts can have a secondary
structure (e.g., stem and loop) generated by intra-sequence hybridization which
can prevent specific sequence regions in the sense or antisense transcripts from
participating in duplex formation. Thus, the antisense sequence
identified according to this aspect of the present invention can be
complementary to its sense counterparts in several regions, which are not
necessarily close to each other when the sense transcript is in linear form.

Although any length of sequence overlap can generate a duplex, overlaps
of at least 5, preferably 20, more preferably 30, even more preferably 40 bp are
considered more indicative of true sense-antisense duplex formation.
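
For illustration, the length of the longest perfectly complementary stretch between a sense sequence and a candidate can be computed as the longest common substring of the sense sequence and the reverse complement of the candidate, and then compared against the 5, 20, 30 and 40 bp levels named above. The function names are hypothetical, and the sketch considers only uninterrupted, perfectly complementary runs, whereas duplex formation as defined herein does not require 100 % complementation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_run(sense: str, candidate: str) -> int:
    """Longest stretch of `sense` that can base-pair perfectly with `candidate`."""
    target = reverse_complement(candidate)
    best = 0
    previous = [0] * (len(target) + 1)
    for i in range(1, len(sense) + 1):
        current = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if sense[i - 1] == target[j - 1]:
                # Extend the common run ending at the previous base pair.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def overlap_tier(length: int) -> int:
    """Highest of the 40/30/20/5 bp levels met by an overlap, or 0 if none."""
    for level in (40, 30, 20, 5):
        if length >= level:
            return level
    return 0


sense = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
```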

The method of uncovering putative antisense transcripts of the present 
invention is preferably carried out using a dedicated computational system. 

Thus, according to another aspect of the present invention and as
illustrated in Figure 2, there is provided a system for generating a database of
putative naturally occurring antisense sequences which system is referred to
hereinunder as system 10.

System 10 includes a processing unit 12, which executes a software
application designed and configured for aligning sense oriented polynucleotide
sequences with expressed polynucleotide sequences and identifying expressed
polynucleotide sequences which are capable of forming a duplex with the sense
oriented polynucleotide sequences, thereby recognizing putative naturally
occurring antisense transcripts. System 10 may also include a user input
interface 14 (e.g., a keyboard and/or a mouse) for inputting database or
database related information, and a user output interface 16 (e.g., a monitor) for
providing database information to a user.

System 10 preferably stores sequence information of the putative
antisense transcripts identified thereby on a computer readable medium such as a
magnetic, optico-magnetic or optical disk to thereby generate a database of
putative antisense transcript sequences. Such a database further includes
information pertaining to database generation (e.g., source library), parameters
used for selecting polynucleotide sequences, putative uses of the stored
sequences, and various other annotations and references which relate to the
stored sequences or respective sense transcripts.

System 10 of the present invention may be used by a user to query the
stored database of sequences, to retrieve nucleotide sequences stored therein or
to generate polynucleotide sequences from user inputted sequences.

System 10 can be any computing platform known in the art including,
but not limited to, a personal computer, a work station, a mainframe and the
like.

The database generated and stored by system 10 can be accessed by an 
on-site user of system 10, or by a remote user communicating with system 10. 

As illustrated in Figure 3, communication between a remote user 18 and
processing unit 12 is preferably effected via a communication network 20.
Communication network 20 can be any private or public communication
network including, but not limited to, a standard or cellular telephony network,
a computer network such as the Internet or intranet, a satellite network or any
combination thereof.

As illustrated in Figure 3, communication network 20 includes one or
more communication servers 22 (one shown in Figure 3) which serve for
communicating data pertaining to the polynucleotide of interest between remote
user 18 and processing unit 12.

It will be appreciated that existing computer networks such as the
Internet can provide the infrastructure and technology necessary for supporting
data communication between any number of sites 24 and remote analysis sites
26.

For example, using a computer operating a Web browser application and
the World Wide Web, any expressed polynucleotide sequence of interest can be
"uploaded" by user 18 onto a Web site maintained by a database server 28.
Following uploading, database server 28, which serves as processing unit 12, can
be instructed by the user to process the polynucleotide as is described
hereinabove.

Following such processing, which can be performed in real time, nucleic
acid sequence results can be displayed at the Web site maintained by database
server 28 and/or communicated back to site 24 via, for example, e-mail
communication.

Thus, using the Internet, a remote configuration of system 10 can
provide polynucleotide sequence analysis services to a plurality of sites 24 (one
shown in Figure 3).

It will be appreciated that this configuration of system 10 of the present
invention is especially advantageous in cases where polynucleotide analysis
cannot be effected on-site, for example, in laboratories which lack the equipment
necessary for executing the analysis or the skills necessary to operate it.

Thus, data extracted from the database of naturally occurring antisense
transcripts of the present invention is of high value for designing
oligonucleotides suitable for transcript detection and quantification and for
sensibly designing artificial antisense oligonucleotides for down-regulation and
elimination of a transcript of interest or for changing the balance between sense
and complementary antisense transcripts. The possibility of up-regulating a
transcript of interest using naturally occurring antisense-based oligonucleotides
generated according to the teachings of the present invention is also realized.
In addition, data extracted from the database of naturally occurring antisense
transcripts may also be used for assessing endogenous double-stranded RNA,
also termed interfering RNA, which may distort gene expression through
RNA degradation, DNA methylation, polycomb-mediated suppression and the
like (for details see the Background section hereinabove).

Antisense technology is based upon the pairing of an artificially
designed antisense oligonucleotide with a target nucleic acid. The use of
antisense technology requires a complementarity of the antisense nucleotide
sequence to a target zone of an mRNA target sequence that will effect
inhibition of gene expression [reviewed in Stein CA. and Cohen JS. (1988)
Cancer Res. 48:2659-68]. Based on empiric experience it was shown that the
success of antisense technology relies on: (i) cellular uptake; (ii) stability of
artificial antisense molecules under physiological conditions (i.e., cellular pH,
endonucleases etc.); (iii) complementation between the oligonucleotide and a
single stranded target sequence (i.e., tertiary structure of target RNA will not
form a good target); (iv) binding specificity of antisense oligonucleotide so as
not to compete with other RNA binders (e.g., proteins) to thereby maintain an
effective antisense concentration.

Various attempts to employ antisense technology while considering the
above discussed limitations included using large amounts of oligonucleotides to
overcome cellular uptake and environmental barriers, and chemically modified
antisense nucleotide compositions for obtaining a higher level of cellular
stability. However, even in cases where uptake difficulties are traversed, the
step of target identification (i.e., RNA-target sequence region) continues to be
the major bottleneck for successful implementation of antisense technology.

U.S. Pat. No. 6,183,966 discloses a method and an apparatus for ranking
nucleic acid sequences based on stability of nucleic acid oligomer sequence
binding interactions to select sequence zones for antisense targeting. This
method, however systematic, relies on thermodynamic analyses combined with
numerous predictions which cannot be considered empirically accurate and
reliable.

Thus according to another aspect of the present invention there is 
provided a method of designing artificial antisense transcripts. 

The method according to this aspect of the present invention is effected
by the following steps.

First, structural and/or functional parameters pertaining to naturally
occurring antisense transcripts are extracted/deduced from a database such as
the one described hereinabove. These parameters may be generally deduced
from all sequences stored in the database, or extracted from specific antisense
sequences or preferably groups of antisense sequences.

Second, artificial antisense molecules of interest are designed according
to the extracted parameters.

Such parameters may be divided into three groups, topographical 
parameters, functional parameters and structural parameters. 

Topographical parameters - (i) position of sequence overlap on the
sense transcript (i.e., coding region, 5' UTR, 3' UTR); (ii) position of the
sequence overlap on the antisense transcript (end overlap, middle overlap, full
overlap); (iii) length of overall sequence overlap; (iv) continuity or
discontinuity of sequence overlap.

Structural parameters (pertaining to both sense and antisense transcripts) -
(i) tertiary structure (i.e., hairpin, helix, stem and loop, pseudoknot, and the
like); (ii) single stranded versus double stranded regions; (iii) GC content; (iv)
tandem Gs; (v) adenosine/inosine content; (vi) thermodynamic stability of
tertiary structures; (vii) duplex melting point; (viii) methylations and other
RNA modifications; (ix) RNA-protein interactions; and (x) transcript length.

Functional parameters - (i) alternative splicing; (ii) tissue expression; 
(iii) pathology specific expression; (iv) antisense promoters; (v) intron content; 
(vi) open reading frame in antisense transcript. 
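A few of the structural parameters listed above can be computed directly from sequence. The following Python sketch, offered only as an illustration, derives GC content, tandem G runs and a rough duplex melting point. The function names and the `min_run` threshold are hypothetical, and the melting point uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only a coarse estimate for short oligonucleotides and is not prescribed by the present text.

```python
import re


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def tandem_g_runs(seq, min_run=3):
    """Return (start, length) for every run of at least min_run consecutive Gs.

    min_run is an assumed threshold chosen for illustration.
    """
    pattern = "G{%d,}" % min_run
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq.upper())]


def wallace_tm(seq):
    """Rough melting point in degrees C: 2*(A+T/U) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T") + seq.count("U")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

Values computed this way could populate per-sequence annotations in the database, from which group-level parameters are then deduced.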

These parameters can be used individually or in combination, in which
case each parameter is preferably weighted according to its importance. Due to
the multi-factorial design of artificial antisense transcripts according to this
aspect of the present invention, a scoring system (described hereinabove) is
preferably employed to simplify and increase the accuracy of the process.
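The weighting of parameters can be sketched as a normalized weighted sum. This is a minimal illustration only: the parameter names, the assumption that each parameter has been rescaled to the range 0 to 1, and the weights themselves are hypothetical and are not taken from the scoring system referred to above.

```python
def weighted_score(parameters, weights):
    """Combine normalized parameter values into a single score.

    parameters: dict of parameter name -> value, assumed rescaled to [0, 1].
    weights:    dict of parameter name -> relative importance (hypothetical).
    Every weighted parameter must be present in `parameters`.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * parameters[name] for name in weights) / total_weight


# Example: overlap length judged three times as important as GC content.
candidate = {"gc_content": 0.5, "overlap_length": 1.0}
importance = {"gc_content": 1.0, "overlap_length": 3.0}
score = weighted_score(candidate, importance)
```

Dividing by the total weight keeps scores comparable when different parameter subsets are used for different candidate designs.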

Synthetic antisense oligonucleotides designed according to the teachings
of the present invention can be generated according to any oligonucleotide
synthesis method known in the art such as enzymatic synthesis or solid phase
synthesis. Equipment and reagents for executing solid-phase synthesis are
commercially available from, for example, Applied Biosystems. Any other
means for such synthesis may also be employed; the actual synthesis of the
oligonucleotides is well within the capabilities of one skilled in the art.

Oligonucleotides used according to this aspect of the present invention
are those having a length selected from a range of 10 to about 200 bases,
preferably 15-150 bases, more preferably 20-100 bases, most preferably 20-50
bases.

The oligonucleotides of the present invention may comprise heterocyclic
nucleosides consisting of purine and pyrimidine bases, bonded in a 3' to 5'
phosphodiester linkage.

Preferably used oligonucleotides are those modified in either backbone,
internucleoside linkages or bases, as is broadly described hereinunder. Such
modifications can oftentimes facilitate oligonucleotide uptake and resistance to
intracellular conditions.

Specific examples of preferred oligonucleotides useful according to this
aspect of the present invention include oligonucleotides containing modified
backbones or non-natural internucleoside linkages. Oligonucleotides having
modified backbones include those that retain a phosphorus atom in the
backbone, as disclosed in U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301;
5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717;
5,321,131; 5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925;
5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361;
and 5,625,050.

Preferred modified oligonucleotide backbones include, for example,
phosphorothioates, chiral phosphorothioates, phosphorodithioates,
phosphotriesters, aminoalkyl phosphotriesters, methyl and other alkyl
phosphonates including 3'-alkylene phosphonates and chiral phosphonates,
phosphinates, phosphoramidates including 3'-amino phosphoramidate and
aminoalkylphosphoramidates, thionophosphoramidates,
thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates
having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having
inverted polarity wherein the adjacent pairs of nucleoside units are linked 3'-5'
to 5'-3' or 2'-5' to 5'-2'. Various salts, mixed salts and free acid forms can also
be used.

Alternatively, modified oligonucleotide backbones that do not include a
phosphorus atom therein have backbones that are formed by short chain alkyl or
cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl
internucleoside linkages, or one or more short chain heteroatomic or
heterocyclic internucleoside linkages. These include those having morpholino
linkages (formed in part from the sugar portion of a nucleoside); siloxane
backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and
thioformacetyl backbones; methylene formacetyl and thioformacetyl
backbones; alkene containing backbones; sulfamate backbones;
methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide
backbones; amide backbones; and others having mixed N, O, S and CH2
component parts, as disclosed in U.S. Pat. Nos. 5,034,506; 5,166,315;
5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562; 5,264,564; 5,405,938;
5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307; 5,561,225; 5,596,086;
5,602,240; 5,610,289; 5,602,240; 5,608,046; 5,610,289; 5,618,704; 5,623,070;
5,663,312; 5,633,360; 5,677,437; and 5,677,439.

Other oligonucleotides which can be used according to the present
invention are those in which both the sugar and the internucleoside linkage, i.e.,
the backbone, of the nucleotide units are replaced with novel groups. The base
units are maintained for complementation with the appropriate polynucleotide
target. An example of such an oligonucleotide mimetic is peptide
nucleic acid (PNA). A PNA oligonucleotide refers to an oligonucleotide where
the sugar-backbone is replaced with an amide containing backbone, in
particular an aminoethylglycine backbone. The bases are retained and are bound
directly or indirectly to aza nitrogen atoms of the amide portion of the
backbone. United States patents that teach the preparation of PNA compounds
include, but are not limited to, U.S. Pat. Nos. 5,539,082; 5,714,331; and
5,719,262, each of which is herein incorporated by reference. Other backbone
modifications which can be used in the present invention are disclosed in U.S.
Pat. No. 6,303,374.

Oligonucleotides of the present invention may also include
base modifications or substitutions. As used herein, "unmodified" or "natural"
bases include the purine bases adenine (A) and guanine (G), and the pyrimidine
bases thymine (T), cytosine (C) and uracil (U). Modified bases include but are
not limited to other synthetic and natural bases such as 5-methylcytosine
(5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine,
6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other
alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and
2-thiocytosine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo
uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo,
8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and
guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted
uracils and cytosines, 7-methylguanine and 7-methyladenine, 8-azaguanine and
8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and
3-deazaadenine. Further bases include those disclosed in U.S. Pat. No.
3,687,808, those disclosed in The Concise Encyclopedia Of Polymer Science
And Engineering, pages 858-859, Kroschwitz, J. I., ed. John Wiley & Sons,
1990, those disclosed by Englisch et al., Angewandte Chemie, International
Edition, 1991, 30, 613, and those disclosed by Sanghvi, Y. S., Chapter 15,
Antisense Research and Applications, pages 289-302, Crooke, S. T. and
Lebleu, B., ed., CRC Press, 1993. Such bases are particularly useful for
increasing the binding affinity of the oligomeric compounds of the invention.
These include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and
O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and
5-propynylcytosine. 5-methylcytosine substitutions have been shown to increase
nucleic acid duplex stability by 0.6-1.2 °C [Sanghvi YS et al. (1993) Antisense
Research and Applications, CRC Press, Boca Raton 276-278] and are presently
preferred base substitutions, even more particularly when combined with
2'-O-methoxyethyl sugar modifications.

Another modification of the oligonucleotides of the invention involves
chemically linking to the oligonucleotide one or more moieties or conjugates
which enhance the activity, cellular distribution or cellular uptake of the
oligonucleotide. Such moieties include but are not limited to lipid moieties
such as a cholesterol moiety, cholic acid, a thioether, e.g., hexyl-S-tritylthiol, a
thiocholesterol, an aliphatic chain, e.g., dodecandiol or undecyl residues, a
phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium
1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate, a polyamine or a polyethylene
glycol chain, or adamantane acetic acid, a palmityl moiety, or an octadecylamine
or hexylamino-carbonyl-oxycholesterol moiety, as disclosed in U.S. Pat. No.
6,303,374.

It is not necessary for all positions in a given oligonucleotide molecule
to be uniformly modified, and in fact more than one of the aforementioned
modifications may be incorporated in a single compound or even at a single
nucleoside within an oligonucleotide.

The present invention also includes antisense molecules which are
chimeric molecules. "Chimeric antisense molecules" are oligonucleotides
which contain two or more chemically distinct regions, each made up of at least
one nucleotide. These oligonucleotides typically contain at least one region
wherein the oligonucleotide is modified so as to confer upon the
oligonucleotide increased resistance to nuclease degradation, increased cellular
uptake, and/or increased binding affinity for the target polynucleotide. An
additional region of the oligonucleotide may serve as a substrate for enzymes
capable of cleaving RNA:DNA or RNA:RNA hybrids. An example of such an
enzyme is RNase H, a cellular endonuclease which cleaves the RNA
strand of an RNA:DNA duplex. Activation of RNase H, therefore, results in
cleavage of the RNA target, thereby greatly enhancing the efficiency of
oligonucleotide inhibition of gene expression. Consequently, comparable
results can often be obtained with shorter oligonucleotides when chimeric
oligonucleotides are used, compared to phosphorothioate
deoxyoligonucleotides hybridizing to the same target region. Cleavage of the
RNA target can be routinely detected by gel electrophoresis and, if necessary,
associated nucleic acid hybridization techniques known in the art.

Chimeric antisense molecules of the present invention may be formed as
composite structures of two or more oligonucleotides or modified
oligonucleotides, as described above. Representative U.S. patents that teach the
preparation of such hybrid structures include, but are not limited to, U.S. Pat.
Nos. 5,013,830; 5,149,797; 5,220,007; 5,256,775; 5,366,878; 5,403,711;
5,491,133; 5,565,350; 5,623,065; 5,652,355; 5,652,356; and 5,700,922, each of
which is herein fully incorporated by reference.

Finally, chimeric oligonucleotides of the present invention can comprise
a ribozyme sequence. Ribozymes are being increasingly used for the
sequence-specific inhibition of gene expression by the cleavage of mRNAs. Several
ribozyme sequences can be fused to the oligonucleotides of the present
invention. These sequences include but are not limited to ANGIOZYME,
specifically inhibiting formation of the VEGF-R (Vascular Endothelial Growth
Factor receptor), a key component in the angiogenesis pathway, and
HEPTAZYME, a ribozyme designed to selectively destroy Hepatitis C Virus
(HCV) RNA (Ribozyme Pharmaceuticals, Incorporated - WEB home page).

The oligonucleotides generated according to the teachings of the present
invention can be used for both diagnostic and therapeutic purposes. For
example, oligonucleotides of the present invention can be used to diagnose and
treat a variety of diseases or pathological conditions associated with an
abnormal expression (i.e., up-regulation or down-regulation) of at least one
mRNA molecule of interest, including but not limited to diabetes, autoimmune
diseases, Parkinson's disease, Alzheimer's disease, HIV, malaria, cholera,
influenza, rabies, diphtheria, breast cancer, colon cancer, cervical cancer,
melanoma, lung cancer, ovarian cancer, pancreatic cancer, prostate cancer,
lymphomas, leukemias and the like, and any other diseases (see Example 8 of the
Examples section) which are associated with aberrant expression of multiple
mRNAs (i.e., sense and/or antisense) or with unregulated formation of
endogenous double stranded RNA complexes.

Present-day mRNA-based diagnostic assays utilize oligonucleotide
probes which are complementary to one or more regions of the mRNA to be
quantitated. Such probes are designed while considering interspecies sequence
variation, sequence length, GC content etc. However, design of such prior art
probes (i.e., riboprobes or deoxyriboprobes) does not take into consideration the
presence of antisense transcripts, which can affect probe binding efficiency.
Discounting antisense presence can lead to inaccurate diagnosis, which is
oftentimes followed by an erroneous treatment protocol.

The present invention provides an mRNA-detection/quantification assay, 
which is devoid of this limitation. 

Thus, according to an additional aspect of the present invention there is
provided a method of quantifying at least one mRNA of interest in a biological
sample.

As used herein, the phrase "biological sample" refers to any sample
derived from biological tissues or fluids, including blood (serum or plasma),
sputum, pleural effusions, urine, biopsy specimens, isolated cells and/or cell
membrane preparations. Methods of obtaining tissue biopsies and body fluids
from mammals are well known in the art.

The method of this aspect of the present invention is effected by
contacting mRNA from a cell type or within a cell with one or more
oligonucleotides that hybridize efficiently with a sequence region of an mRNA
transcript which is not complementary with a naturally occurring antisense
transcript.
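Selecting a probe target region that is not complementary with a naturally occurring antisense transcript amounts to subtracting the known overlap intervals from the mRNA coordinates. The following Python sketch illustrates this under stated assumptions: overlap coordinates are half-open (start, end) positions on the mRNA, as might be retrieved from a database of the kind described above, and both the function name and the `min_len` probe length are hypothetical.

```python
def uncovered_regions(mrna_length, antisense_overlaps, min_len=20):
    """Return mRNA intervals free of antisense overlap and at least min_len long.

    antisense_overlaps: iterable of half-open (start, end) positions on the mRNA.
    min_len: assumed minimum probe length, chosen for illustration.
    """
    regions = []
    cursor = 0  # first position not yet known to be covered
    for start, end in sorted(antisense_overlaps):
        # Clip each interval to the transcript boundaries.
        start = min(max(start, 0), mrna_length)
        end = min(max(end, 0), mrna_length)
        if start - cursor >= min_len:
            regions.append((cursor, start))
        cursor = max(cursor, end)
    if mrna_length - cursor >= min_len:
        regions.append((cursor, mrna_length))
    return regions
```

For a 100-base transcript with overlaps at (30, 50) and (45, 60), the candidate probe regions are positions 0-30 and 60-100; probes would then be designed within those regions using the usual criteria.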

In addition to the limitation described above, prior art
diagnostic/detection assays also fail to consider the effect of antisense
transcription on the protein expression levels of a gene of interest. It stands to
reason that presence of antisense transcripts in a biological sample can
substantially reduce the resultant protein levels translated from a
complementary sense transcript. Consistently, diseases which are associated
with endogenous dsRNA complexes are also very difficult to detect and
moreover to treat, due to insufficient sequence data pertaining to duplex
forming transcripts.

Thus, for accurate quantification of gene expression, both the sense and 
antisense levels must be quantified and/or their respective expression ratio must 
be determined. 

By contacting a biological sample with one or more pairs of
oligonucleotides, where one oligonucleotide is capable of hybridizing with the
mRNA of interest and the second oligonucleotide is capable of hybridizing with
a naturally occurring antisense transcript which is complementary with the
mRNA of interest, such accurate quantification can be effected.
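Once signals from an oligonucleotide pair have been measured, the sense/antisense expression ratio can be summarized as a log2 ratio. The Python sketch below is illustrative only; the function name and the pseudocount, which avoids division by zero when one transcript is undetected, are assumptions and not part of the described method.

```python
import math


def sense_antisense_log2_ratio(sense_signal, antisense_signal, pseudocount=1.0):
    """log2 of (sense + pseudocount) / (antisense + pseudocount).

    Positive values indicate sense excess, negative values antisense excess.
    The pseudocount is an assumed stabilizer for zero signals.
    """
    if sense_signal < 0 or antisense_signal < 0:
        raise ValueError("signals must be non-negative")
    return math.log2((sense_signal + pseudocount) / (antisense_signal + pseudocount))
```

A value near zero indicates balanced expression; comparing the value obtained for a test sample with that of a reference sample highlights a shifted balance.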

Contacting the oligonucleotides of the present invention with the
biological sample is effected by stringent, moderate or mild hybridization (as
used in any polynucleotide hybridization assay such as northern blot, dot blot,
RNase protection assay, RT-PCR and the like), wherein stringent
hybridization is effected by a hybridization solution of 6 x SSC and 1 % SDS
or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6),
0.5 % SDS, 100 mg/ml denatured salmon sperm DNA and 0.1 % nonfat dried
milk, hybridization temperature of 1 - 1.5 °C below the Tm, final wash solution
of 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6),
0.5 % SDS at 1 - 1.5 °C below the Tm; moderate hybridization is effected by a
hybridization solution of 6 x SSC and 0.1 % SDS or 3 M TMACl, 0.01 M
sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS, 100 mg/ml
denatured salmon sperm DNA and 0.1 % nonfat dried milk, hybridization
temperature of 2 - 2.5 °C below the Tm, final wash solution of 3 M TMACl,
0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS at 1 -
1.5 °C below the Tm, final wash solution of 6 x SSC, and final wash at 22 °C;
whereas mild hybridization is effected by a hybridization solution of 6 x SSC
and 1 % SDS or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM
EDTA (pH 7.6), 0.5 % SDS, 100 mg/ml denatured salmon sperm DNA and
0.1 % nonfat dried milk, hybridization temperature of 37 °C, final wash solution
of 6 x SSC and final wash at 22 °C.

The oligonucleotides of the present invention can be attached to a solid 
substrate, which may consist of a particulate solid phase such as nylon filters, 
glass slides or silicon chips [Schena et al. (1995) Science 270:467-470]. 

In a particular embodiment, oligonucleotides of the present invention
can be attached to a solid substrate which is designed as a microarray.
Microarrays are known in the art and consist of a surface to which probes that
correspond in sequence to gene products (e.g., cDNAs, mRNAs, cRNAs,
polypeptides, and fragments thereof) can be specifically hybridized or bound at
a known position (regiospecificity).

Several methods for attaching the oligonucleotides to a microarray are
known in the art including but not limited to glass-printing, described generally
by Schena et al., 1995, Science 270:467-470, photolithographic techniques
[Fodor et al. (1991) Science 251:767-773], inkjet printing, masking and the
like.

In general, quantifying hybridization complexes is well known in the art
and may be achieved by any one of several approaches. These approaches are
generally based on the detection of a label or marker, such as any radioactive,
fluorescent, biological or enzymatic tags or labels of standard use in the art. A
label can be applied on either the oligonucleotide probes or nucleic acids
derived from the biological sample.

The following illustrates a number of labeling methods suitable for use
in the present invention. For example, oligonucleotides of the present invention
can be labeled subsequent to synthesis, by incorporating biotinylated dNTPs or
rNTPs, or some similar means (e.g., photo-cross-linking a psoralen derivative of
biotin to RNAs), followed by addition of labeled streptavidin (e.g.,
phycoerythrin-conjugated streptavidin) or the equivalent. Alternatively, when
fluorescently-labeled oligonucleotide probes are used, fluorescein, lissamine,
phycoerythrin, rhodamine (Perkin Elmer Cetus), Cy2, Cy3, Cy3.5, Cy5, Cy5.5,
Cy7, FluorX (Amersham) and others [e.g., Kricka et al. (1992), Academic
Press San Diego, Calif.] can be attached to the oligonucleotides. It will be
appreciated that pairs of fluorophores are chosen when distinction between two
emission spectra of two oligonucleotides is desired or, optionally, a label other
than a fluorescent label is used. For example, a radioactive label, or a pair of
radioactive labels with distinct emission spectra, can be used [Zhao et al. (1995)
Gene 156:207]. However, because of scattering of radioactive particles, and
the consequent requirement for widely spaced binding sites, the use of
fluorophores rather than radioisotopes is more preferred.

The intensity of signal produced in any of the detection methods
described hereinabove may be analyzed manually or using a software
application and hardware suited for such purposes.

In general, mRNA quantification is preferably effected alongside a
calibration curve so as to enable accurate mRNA determination. Furthermore,
quantifying transcript(s) originating from a biological sample is preferably
effected by comparison to a normal sample, which sample is characterized by a
normal expression pattern of the examined transcript(s).
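Quantification against a calibration curve can be sketched as an ordinary least-squares line fitted to standards of known amount, which is then inverted to convert a measured signal into an amount. The Python example below assumes a linear signal response over the working range; the function names and the example standards are hypothetical and serve only to illustrate the arithmetic.

```python
def fit_calibration_line(amounts, signals):
    """Least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("calibration amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the calibration line to estimate the transcript amount."""
    return (signal - intercept) / slope


# Hypothetical standards: known amounts and their measured signals.
slope, intercept = fit_calibration_line([0, 1, 2, 4], [1, 3, 5, 9])
estimated = amount_from_signal(7, slope, intercept)
```

The same fitted line can be applied to both the test sample and the normal sample, so that the two are compared in units of amount rather than raw signal.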

It will be appreciated that the detection method described above can also
be used for quantifying at least one naturally occurring antisense transcript in a
biological sample. In such a case, the oligonucleotide used for quantification is
designed to hybridize with a sequence region of the naturally occurring antisense
transcript of interest which is not complementary with a naturally occurring
mRNA transcript.

The diagnostic assays described hereinabove can be used to accurately
distinguish between absence, presence and excess expression of any transcripts
of interest (e.g., sense, antisense), and to monitor their level during therapeutic
intervention. These methods are also capable of diagnosing diseases associated
with an improper balance or ratio between sense and antisense expression and
diseases associated with endogenous dsRNA.

Further description of oligonucleotide-pair arrays is provided in 
Example 9 of the Examples section which follows. 

As discussed hereinabove, oligonucleotides of the present invention can
also be used for therapeutic purposes, such as treating diseases or conditions
associated with aberrant expression levels of one or more sense and/or
antisense transcripts, and conditions which are associated with endogenous
dsRNA, such as unregulated formation of double-stranded RNA (i.e.,
up/down-regulation).

Accumulated knowledge shows a strong correlation between a variety of
human diseases and the mutation, over-expression and function of the protein
building blocks (i.e., protein kinases, phosphatases) and their effectors and
regulators, which constitute numerous intracellular signaling pathways. For
instance, inactivation of both copies of ZAP-70 or Jak-3 causes severe
combined immunodeficiency, and mutation of the X-linked BTK gene results in
agammaglobulinemia. Many genetic disorders are also associated with
mutations in, for example, protein-serine kinases (PSKs) and phosphatases. The
Coffin-Lowry syndrome results from inactivation of the X-linked Rsk2 gene,
and myotonic dystrophy is due to decreased levels of expression of the
myotonic dystrophy PSK. In addition, over-expression of ErbB2 receptor
tyrosine kinase is implicated in breast and ovarian carcinoma [reviewed by
Hunter T. (2000) Cell 100:113-127].

Given the importance of activated kinases in a variety of disorders such
as cancer, it would be anticipated that phosphatases would be found to act
as tumor suppressors and to be promising drug targets. So far this has not
proved to be the case. Furthermore, a number of diseases are associated with
insufficient expression of signaling molecules, including non-insulin-dependent
diabetes and peripheral neuropathies.

Thus, it is conceivable that identification of naturally occurring antisense
transcripts of signaling molecules participating in specified signaling pathways
may serve as a promising tool for both identification and particularly treatment
of a variety of disorders at any gene expression level (i.e., RNA, DNA or
protein).

The term "treating" refers to alleviating or diminishing a symptom 
associated with the disease or the condition. Preferably, treating cures, e.g., 
substantially eliminates, and/or substantially decreases, the symptoms 
associated with the diseases or conditions of the present invention. 
The treatment method according to the teachings of the present invention
includes administering to an individual a therapeutically effective amount of the
synthetic antisense oligonucleotides of the present invention. Preferred
individual subjects according to the present invention are mammals such as
canines, felines, ovines, porcines, equines, bovines, humans and the like.

A therapeutically effective amount implies an amount of agent effective
to prevent, alleviate or ameliorate symptoms of disease or prolong the survival
of the individual being treated.

The agent of the method of the present invention can be administered to
an individual per se, or as part of a pharmaceutical composition where it is
mixed with a pharmaceutically acceptable carrier.

As used herein, a "pharmaceutical composition" refers to a composition
of one or more of the agents described hereinabove, or physiologically
acceptable salts or prodrugs thereof, with other chemical components. The
purpose of a pharmaceutical composition is to facilitate administration of a
compound to an organism.

The pharmaceutical compositions of the present invention may be 
administered in a number of ways depending upon whether local or systemic 
treatment is desired and upon the area to be treated. Administration may be 
topical (including ophthalmic and to mucous membranes including vaginal and 
30 rectal delivery), pulmonary, e.g., by inhalation or insufflation of powders or 
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aerosols, including by nebulizer; intratracheal, intranasal, epidermal and 
transdermal), oral or parenteral. Parenteral administration includes intravenous, 
intraarterial, subcutaneous, intraperitoneal or intramuscular injection or 
infusion; or intracranial, e.g., intrathecal or intraventricular, administration. 
Oligonucleotides with at least one 2'-O-methoxyethyl modification are believed
to be particularly useful for oral administration. 

Pharmaceutical compositions and formulations for topical administration 
may include transdermal patches, ointments, lotions, creams, gels, drops, 
suppositories, sprays, liquids and powders. Conventional pharmaceutical 
carriers, aqueous, powder or oily bases, thickeners and the like may be
necessary or desirable. Coated condoms, gloves and the like may also be useful. 

Compositions and formulations for oral administration include powders 
or granules, suspensions or solutions in water or non-aqueous media, capsules, 
sachets or tablets. Thickeners, flavoring agents, diluents, emulsifiers, 
dispersing aids or binders may be desirable.

Compositions and formulations for parenteral, intrathecal or 
intraventricular administration may include sterile aqueous solutions which may 
also contain buffers, diluents and other suitable additives such as, but not 
limited to, penetration enhancers, carrier compounds and other 
pharmaceutically acceptable carriers or excipients.

Pharmaceutical compositions of the present invention include, but are 
not limited to, solutions, emulsions, and liposome-containing formulations. 
These compositions may be generated from a variety of components that 
include, but are not limited to, preformed liquids, self-emulsifying solids and 
self-emulsifying semisolids.

The pharmaceutical formulations of the present invention, which may 
conveniently be presented in unit dosage form, may be prepared according to 
conventional techniques well known in the pharmaceutical industry. Such 
techniques include the step of bringing into association the active ingredients 
with the pharmaceutical carrier(s) or excipient(s). In general the formulations
are prepared by uniformly and intimately bringing into association the active
ingredients with liquid carriers or finely divided solid carriers or both, and then, 
if necessary, shaping the product. 

The compositions of the present invention may be formulated into any of 
many possible dosage forms such as, but not limited to, tablets, capsules, liquid
syrups, soft gels, suppositories, and enemas. The compositions of the present 
invention may also be formulated as suspensions in aqueous, non-aqueous or 
mixed media. Aqueous suspensions may further contain substances which 
increase the viscosity of the suspension including, for example, sodium
carboxymethylcellulose, sorbitol and/or dextran. The suspension may also
contain stabilizers. 

In one embodiment of the present invention the pharmaceutical 
compositions may be formulated and used as foams. Pharmaceutical foams 
include formulations such as, but not limited to, emulsions, microemulsions,
creams, jellies and liposomes. While basically similar in nature these
formulations vary in the components and the consistency of the final product. 
The preparation of such compositions and formulations is generally known to 
those skilled in the pharmaceutical and formulation arts and may be applied to 
the formulation of the compositions of the present invention. 

The pharmaceutical compositions of the present invention may employ
various penetration enhancers to effect the efficient delivery of nucleic acids,
particularly oligonucleotides, to the skin of animals. 

Penetration enhancers may be classified as belonging to one of five 
broad categories, i.e., surfactants, fatty acids, bile salts, chelating agents, and
non-chelating non-surfactants [Lee et al., Critical Reviews in Therapeutic Drug
Carrier Systems (1991) 92] as disclosed in U.S. Pat. Nos. 6,300,132, 6,271,030,
6,277,633, 6,284,538, 6,287,860, 6,294,382, 6,277,640 and 6,258,601 each of 
which is herein fully incorporated by reference. 

Other substances that enhance uptake of oligonucleotides at the cellular
level may also be added to the pharmaceutical compositions of the present
invention. For example, cationic lipids, such as lipofectin [U.S. Pat. No.
5,705,188], cationic glycerol derivatives, and polycationic molecules, such as
polylysine [PCT Application WO 97/30731], are also known to enhance the
cellular uptake of oligonucleotides.

Other reagents may be utilized to enhance the penetration of the
administered nucleic acids, including glycols such as ethylene glycol and
propylene glycol, pyrrols such as 2-pyrrol, azones, and terpenes such as 
limonene and menthone. 

Certain pharmaceutical compositions of the present invention may also
incorporate carrier compounds. As used herein, "carrier compound" or
"carrier" can refer to a nucleic acid, or analog thereof, which is inert (i.e., does 
not possess biological activity per se) but is recognized as a nucleic acid by in 
vivo processes that reduce the bioavailability of a nucleic acid having biological 
activity by, for example, degrading the biologically active nucleic acid or
promoting its removal from circulation. The co-administration of a nucleic acid
and a carrier compound, typically with an excess of the latter substance, can 
result in a substantial reduction of the amount of nucleic acid recovered in the 
liver, kidney or other extracirculatory reservoirs, presumably due to competition 
between the carrier compound and the nucleic acid for a common receptor. For
example, the recovery of a partially phosphorothioate oligonucleotide in hepatic
tissue can be reduced when it is coadministered with polyinosinic acid, dextran
sulfate, polycytidic acid or 4-acetamido-4'-isothiocyano-stilbene-2,2'-disulfonic
acid [Miyao et al., Antisense Res. Dev., (1995) 5:115-121; Takakura et al.,
Antisense & Nucl. Acid Drug Dev. (1996) 6:177-183]. 

In contrast to a carrier compound, an "excipient" is a pharmaceutically
acceptable solvent, suspending agent or any other pharmacologically inert
vehicle for delivering one or more nucleic acids to an animal. The excipient 
may be liquid or solid and is selected, with the planned manner of 
administration in mind, so as to provide for the desired bulk, consistency, etc.,
when combined with a nucleic acid and the other components of a given
pharmaceutical composition. Typical excipients include, but are not limited to,
binding agents (e.g., pregelatinized maize starch, polyvinylpyrrolidone or 
hydroxypropyl methylcellulose, etc.); fillers (e.g., lactose and other sugars, 
microcrystalline cellulose, pectin, gelatin, calcium sulfate, ethyl cellulose,
polyacrylates or calcium hydrogen phosphate, etc.); lubricants (e.g., magnesium
stearate, talc, silica, colloidal silicon dioxide, stearic acid, metallic stearates, 
hydrogenated vegetable oils, corn starch, polyethylene glycols, sodium 
benzoate, sodium acetate, etc.); disintegrants (e.g., starch, sodium starch 
glycolate, etc.); and wetting agents (e.g., sodium lauryl sulphate, etc.). 

Pharmaceutically acceptable organic or inorganic excipients suitable for
non-parenteral administration which do not deleteriously react with nucleic
acids can also be used to formulate the compositions of the present invention. 
Suitable pharmaceutically acceptable carriers include, but are not limited to, 
water, salt solutions, alcohols, polyethylene glycols, gelatin, lactose, amylose,
magnesium stearate, talc, silicic acid, viscous paraffin, hydroxymethylcellulose,
polyvinylpyrrolidone and the like. 

Formulations for topical administration of nucleic acids may include 
sterile and non-sterile aqueous solutions, non-aqueous solutions in common 
solvents such as alcohols, or solutions of the nucleic acids in liquid or solid oil
bases. The solutions may also contain buffers, diluents and other suitable
additives. Pharmaceutically acceptable organic or inorganic excipients suitable 
for non-parenteral administration, which do not deleteriously react with nucleic 
acids can be used. 

Suitable pharmaceutically acceptable excipients include, but are not 
limited to, water, salt solutions, alcohol, polyethylene glycols, gelatin, lactose,
amylose, magnesium stearate, talc, silicic acid, viscous paraffin, 
hydroxymethylcellulose, polyvinylpyrrolidone and the like. 

The compositions of the present invention may additionally contain other 
adjunct components conventionally found in pharmaceutical compositions, at 
their art-established usage levels. Thus, for example, the compositions may
contain additional, compatible, pharmaceutically-active materials such as, for
example, antipruritics, astringents, local anesthetics or anti-inflammatory 
agents, or may contain additional materials useful in physically formulating 
various dosage forms of the compositions of the present invention, such as 
dyes, flavoring agents, preservatives, antioxidants, opacifiers, thickening agents
and stabilizers. However, such materials, when added, should not unduly 
interfere with the biological activities of the components of the compositions of 
the present invention. The formulations can be sterilized and, if desired, mixed 
with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents,
emulsifiers, salts for influencing osmotic pressure, buffers, colorings,
flavorings and/or aromatic substances and the like which do not deleteriously 
interact with the nucleic acid(s) of the formulation. Aqueous suspensions may 
contain substances which increase the viscosity of the suspension including, for 
example, sodium carboxymethylcellulose, sorbitol and/or dextran. The
suspension may also contain stabilizers.

The formulation of therapeutic compositions and their subsequent 
administration is believed to be within the skill of those in the art. Dosing is 
dependent on severity and responsiveness of the disease state to be treated, with 
the course of treatment lasting from several days to several months, or until a
cure is effected or a diminution of the disease state is achieved. Optimal dosing
schedules can be calculated from measurements of drug accumulation in the 
body of the patient. Persons of ordinary skill can easily determine optimum 
dosages, dosing methodologies and repetition rates. Optimum dosages may vary 
depending on the relative potency of individual oligonucleotides, and can
generally be estimated based on EC50 found to be effective in in vitro and in
vivo animal models. Persons of ordinary skill in the art can easily estimate 
dosing and repetition rates based on measured residence times and 
concentrations of the oligonucleotide in bodily fluids or tissues. Following 
successful treatment, it may be desirable to have the patient undergo 
maintenance therapy to prevent the recurrence of the disease state, wherein the
oligonucleotide is administered in maintenance doses. 

The methods of the present invention have evident utility in the 
diagnosis and treatment of various diseases and conditions. In addition, such 
methods can also be used in non-clinical applications, such as, for example,
differential cloning, detection of rearrangements in DNA sequences as
disclosed in U.S. Pat. No. 5,994,320, drug discovery and the like.

The oligonucleotides generated according to the teachings of the present 
invention can be included in a diagnostic or therapeutic kit. For example, 
oligonucleotide sets pertaining to specific disease related transcripts can be
packaged in one or more containers with appropriate buffers and
preservatives along with suitable instructions for use and used for diagnosis or 
for directing therapeutic treatment. 

Preferably, the containers include a label. Suitable containers include, 
for example, bottles, vials, syringes, and test tubes. The containers may be
formed from a variety of materials such as glass or plastic. 

In addition, other additives such as stabilizers, buffers, blockers and the 
like may also be added. 

Additional objects, advantages, and novel features of the present 
invention will become apparent to one ordinarily skilled in the art upon
examination of the following examples, which are not intended to be limiting. 
Additionally, each of the various embodiments and aspects of the present 
invention as delineated hereinabove and as claimed in the claims section below 
finds experimental support in the following examples. 

EXAMPLES 

Reference is now made to the following examples, which together with 
the above descriptions, illustrate the invention in a non limiting fashion. 

Generally, the nomenclature used herein and the laboratory procedures 
utilized in the present invention include molecular, biochemical,
microbiological and recombinant DNA techniques. Such techniques are
thoroughly explained in the literature. See, for example, "Molecular Cloning: 
A Laboratory Manual" Sambrook et al., (1989); "Current Protocols in Molecular
Biology" Volumes I-III Ausubel, R. M., ed. (1994); Ausubel et al., "Current
Protocols in Molecular Biology", John Wiley and Sons, Baltimore, Maryland
(1989); Perbal, "A Practical Guide to Molecular Cloning", John Wiley & Sons, 
New York (1988); Watson et al., "Recombinant DNA", Scientific American 
Books, New York; Birren et al. (eds) "Genome Analysis: A Laboratory Manual 
Series", Vols. 1-4, Cold Spring Harbor Laboratory Press, New York (1998); 
methodologies as set forth in U.S. Pat. Nos. 4,666,828; 4,683,202; 4,801,531;
5,192,659 and 5,272,057; "Cell Biology: A Laboratory Handbook", Volumes I-III
Cellis, J. E., ed. (1994); "Current Protocols in Immunology" Volumes I-III
Coligan J. E., ed. (1994); Stites et al. (eds), "Basic and Clinical Immunology"
(8th Edition), Appleton & Lange, Norwalk, CT (1994); Mishell and Shiigi
(eds), "Selected Methods in Cellular Immunology", W. H. Freeman and Co.,
New York (1980); available immunoassays are extensively described in the
patent and scientific literature, see, for example, U.S. Pat. Nos. 3,791,932;
3,839,153; 3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654;
3,935,074; 3,984,533; 3,996,345; 4,034,074; 4,098,876; 4,879,219; 5,011,771
and 5,281,521; "Oligonucleotide Synthesis" Gait, M. J., ed. (1984); "Nucleic
Acid Hybridization" Hames, B. D., and Higgins S. J., eds. (1985); 
"Transcription and Translation" Hames, B. D., and Higgins S. J., eds. (1984); 
"Animal Cell Culture" Freshney, R. I., ed. (1986); "Immobilized Cells and 
Enzymes" IRL Press, (1986); "A Practical Guide to Molecular Cloning" Perbal,
B., (1984) and "Methods in Enzymology" Vol. 1-317, Academic Press; "PCR
Protocols: A Guide To Methods And Applications", Academic Press, San
Diego, CA (1990); Marshak et al., "Strategies for Protein Purification and
Characterization - A Laboratory Course Manual" CSHL Press (1996); all of
which are incorporated by reference as if fully set forth herein. Other general
references are provided throughout this document. The procedures therein are
believed to be well known in the art and are provided for the convenience of the
reader. All the information contained therein is incorporated herein by 
reference. 



In-vitro expression substantiation of computationally retrieved naturally
occurring antisense transcripts

In-vitro expression assays were conducted in order to validate the
existence of naturally occurring antisense sequences identified according to the
teachings of the present invention.

Table 1 below lists polynucleotide sequence pairs that were selected for
the in-vitro expression validation assays described in Examples 1-7.



Table 1

Name of sense/          Sense            Sense    Antisense        Antisense  Overlap  Start of    Start of
antisense pair          transcript       length   transcript       length     length   overlap,    overlap,
                                         (nt)                      (nt)       (nt)     sense       antisense

53BP1_76P               53BP1            10394    76P              6837       3046     5463        2018
                        (SEQ ID NO: 15)           (SEQ ID NO: 16)

CIDEB_BLTR2 (1)         CIDEB1           2289     BLTR2            6530       2254     17          1
                        (SEQ ID NO: 19)           (SEQ ID NO: 21)

CIDEB_BLTR2 (2)         CIDEB2           1511     BLTR2            6530       1410                 1
                        (SEQ ID NO: 20)

APAF1_EB1               aAPAF1           7042     EB1a             1752       141      6889        1612
                        (SEQ ID NO: 24)           (SEQ ID NO: 25)

AChR_MINK2              AchR             2457     MINK2            4863       236      2175        4853
                        (SEQ ID NO: 29)           (SEQ ID NO: 30)

M-AchR_Anti-AChR        M-AchR           1590     M-Anti-AchR      2227       672      934         506
                        (SEQ ID NO: 35)           (SEQ ID NO: 36)

CyclinE2_Anti-CyclinE2  CyclinE2         2714     Anti-CyclinE2    5773       1855     565         2006
                        (SEQ ID NO: 33)           (SEQ ID NO: 34)

Sequence alignments of overlapping regions of each sense-antisense pair 
were performed using the BLAST sequence alignment algorithm (Basic Local
Alignment Search Tool, available through www.ncbi.nlm.nih.gov/BLAST
using the default parameters) and are exhibited in Figures 5a-g.

A microarray-based analysis was conducted, as well, in order to validate 
the existence of naturally occurring, antisense sequences identified according to 
the teachings of the present invention. The results are described in Example 9.

Materials and Experimental Methods 
RNA probes generation and northern analysis 

RNA probes for northern analysis were generated by PCR amplification
of a desired DNA fragment and cloning into Zero Blunt TOPO (Invitrogen
Corp.) or pSPT18/19 vectors (Roche Ltd.). Alternatively, PCR products were
ligated into T7 RNA polymerase promoter-containing adaptors using the
Lig'nScribe kit (Ambion Europe Ltd.). Corresponding RNA transcripts were
synthesized using T7 RNA polymerase (Roche Ltd.) and labeled with 32P-UTP
according to manufacturer's instructions. RNA probes were purified on Mini
Quick Spin RNA columns.

Commercial membranes containing Poly(A)-RNA from various human 
tissues (2 ng RNA per lane) were obtained from Origene (OriGene 
Technologies Inc.) and Ambion (Ambion Inc.). 

Alternatively, 2 µg of poly(A)-RNA prepared from various human cell-lines
was electrophoretically separated on a 1 % agarose gel,
electrotransferred to Nytran Supercharge membrane (Schleicher & Schuell)
and fixed by UV radiation. Membranes were stained with
methylene blue to ensure quantitative RNA transfer. Membranes were then
prehybridized in a hybridization solution (UltraHyb solution, Ambion Europe
Ltd.) for 30 minutes at 68 °C in a rotating hybridization tube.

Hybridization solution was then supplemented with 10^6 cpm of labeled
RNA probe per ml of hybridization solution. Blots were hybridized for 16
hours at 68 °C in a rotating hybridization tube. Membranes were then washed
twice with 2 x SSC, 0.1 % sodium dodecyl sulfate (SDS) and twice with 0.1 %
SDS at 68 °C. RNA transcript signals were detected using a phosphoimager
(Molecular Dynamics, Sunnyvale, CA).

Microarray 

Oligonucleotide design - oligonucleotide design tools (1) were applied
to each pair of sense/antisense genes in order to select two complementary
60-mer oligonucleotides from the region where the two genes overlap. The design
criteria included the following: low cross-homology (up to 75%) to other
expressed sequences in the human transcriptome; a continuous hit of no more
than 17 bp to the sequence of another gene; balanced GC content (30-70%)
without significant windows of local imbalance; no more than 2 palindromes
with a length of 6 bp; a hit of no more than 15 bp to a repeat, vector or
low-complexity region; and no long stretches of identical nucleotides.
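
The sequence-intrinsic criteria above (GC content, 6 bp palindromes, runs of identical nucleotides) can be sketched in Python as follows. This is an illustrative sketch only and not the design tool referred to in the text: the function names are hypothetical, the run-length limit `max_run=5` is an assumed value (the text says only "no long stretches"), and the cross-homology and repeat/vector criteria are omitted because they require a sequence database search.

```python
# Illustrative check of the sequence-intrinsic oligo criteria.
# All names are hypothetical; max_run is an assumed threshold.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_palindromes(seq, size=6):
    """Number of windows of the given size equal to their own reverse complement."""
    seq = seq.upper()
    return sum(
        1
        for i in range(len(seq) - size + 1)
        if seq[i:i + size] == reverse_complement(seq[i:i + size])
    )


def longest_homopolymer(seq):
    """Length of the longest run of one repeated nucleotide."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def passes_local_checks(seq, gc_min=0.30, gc_max=0.70,
                        max_palindromes=2, palindrome_size=6, max_run=5):
    """True if the oligo meets the GC, palindrome and run-length criteria."""
    if not seq:
        return False
    return (
        gc_min <= gc_fraction(seq) <= gc_max
        and count_palindromes(seq, palindrome_size) <= max_palindromes
        and longest_homopolymer(seq) <= max_run
    )
```

A candidate 60-mer would be passed through `passes_local_checks` before the database-dependent criteria are evaluated.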

Microarray preparation - 60-mer oligonucleotides were synthesized by
Sigma-Genosys (The Woodlands, TX), resuspended at 40 µM in 3X SSC, and
spotted in quadruplicate on poly-L-lysine coated glass slides as detailed in the
online protocol of the National Human Genome Research Institute
(http://www.nhgri.nih.gov/DIR/Microarray/Protocols.pdf). To avoid local
differences in the hybridization conditions, the probes selected from the
overlapping regions of each sense/antisense pair were spotted in the same
block, next to each other.

Human cell lines - The following cell lines were purchased
from ATCC (Manassas, VA): MCF7 (breast adenocarcinoma, Cat. No. HTB-22),
HeLa (cervical adenocarcinoma, Cat. No. CCL-2), HEK-293 (embryonal
kidney cells, Cat. No. CRL-1573), Jurkat (acute T-cell leukemia, Cat. No. TIB-152),
K-562 (chronic myelogenous leukemia, Cat. No. CCL-243), HepG2 (liver
carcinoma, Cat. No. HB-8065), T24 (urinary bladder carcinoma, Cat. No. HTB-4),
SK-N-DZ (neuroblastoma, Cat. No. CRL-2149), NK-92 (non-Hodgkin's
lymphoma, Cat. No. CRL-2407), MG-63 (osteosarcoma, Cat. No. CRL-1427),
DU 145 (prostatic carcinoma, Cat. No. HTB-81), G-361 (melanoma, Cat. No.
CRL-1424), PANC-1 (pancreatic carcinoma, Cat. No. CRL-1469), ES-2 (ovary
clear cell carcinoma, Cat. No. CRL-1978), Y79 (retinoblastoma, Cat. No. HTB-18),
HT-29 (colorectal adenocarcinoma, Cat. No. HTB-38), H1299 (large cell
lung carcinoma, Cat. No. CRL-5803), SNU1 (gastric carcinoma, Cat. No.
CRL-5971), NL564 (EBV-transformed human lymphoblasts) and MCF10
(benign tumor breast cells).

RNA purification - Total RNA was extracted from the above mentioned 
human cell lines using TriReagent (Molecular Research Center, Cincinnati, 
OH). Poly(A)+ mRNA was purified using two cycles of the Dynabeads mRNA 
Purification Kit (Dynal Biotech ASA, Oslo, Norway), as per manufacturer
instructions. The removal of traces of ribosomal RNA was confirmed by
agarose gel electrophoresis. Poly(A)+ mRNAs from human testis, placenta, 
lung and brain tissue were purchased from BioChain Institute, Inc. (Hayward, 
CA). mRNAs of all cell lines described above were combined in equal 
quantities to obtain the reference 'mRNA pool'. 

Preparation of labeled cDNA - For each hybridization, labeled cDNA
was synthesized by reverse transcription of 0.5 µg of mRNA, in the presence of
100 pmol of random 9-mers, 1 µg of oligo(dT)20, 1X RT buffer, 10 mM DTT, 3
nmol of Cy5- or Cy3-conjugated dUTP, 0.5 mM of dATP, dGTP and dCTP,
and 0.2 mM dTTP, in a final volume of 40 µl (Amersham). The reaction
mixture was incubated for 5 minutes at 65 °C and cooled to 42 °C. 600 Units
of reverse transcriptase (Superscript II, Invitrogen, Carlsbad, CA) and 40 U of
RNase inhibitor (RNasin, Promega, Madison, WI) were added and the reaction
was incubated for 30 minutes at 42 °C. An additional 200 U of Superscript II
were added and the reaction was incubated for another 15 minutes. Remaining
RNA was degraded by the addition of 200 mM NaOH and 50 mM EDTA, at 65
°C for 10 minutes. The mixture was neutralized by adding half a volume of 1M
Tris-HCl pH 7.5. Hybridizations were performed in duplicate using fluorescent
reversal of Cy3- and Cy5-labeled cDNA from test cell mRNAs and pooled
mRNAs. Pairs of Cy5/Cy3-labeled cDNA samples were combined, and
subsequently purified and concentrated to a final volume of 5-7 µl using a
Microcon-30 (Millipore) concentrator.

Hybridization and washing conditions - Microarray slides were
prehybridized with 40 µl of 5X SSC, 0.1 % SDS and 1 % BSA for 30 min at 42
°C, washed for 2 minutes with double distilled water, then rinsed with
isopropanol, and spun dry at 500 g for 3 minutes. Prior to hybridization, the
labeled probe was combined with 10 ng of Cot-1 DNA, 10 µg poly(dA)80, and
4 µg yeast tRNA, in a final volume of 15 µl. The mixture was denatured at 100
°C for 3 minutes and placed on ice. Formamide (final concentration 16 %), SSC
(to 5X concentration) and 0.1 % SDS were added to a final volume of 30 µl.
The mixture was placed on the array under a glass cover slip in a tightly sealed
hybridization chamber, and immersed in a water bath at 42 °C, for 16 hours.
Microarray slides were then washed for 4 minutes with 2X SSC, 0.1 % SDS; 4
minutes with 1X SSC, 0.01 % SDS; 4 minutes with 0.2X SSC and 15 seconds
with 0.05X SSC and spun dry by centrifugation for 3 minutes at 500 g.

Image processing - Following hybridization, arrays were scanned using 
a GenePix 4000B scanner (Axon Instruments, Union City, CA). Scanned array 
images were manually inspected and areas with visible artifacts or deformities 
were marked. Images were processed using GenePix Pro 3.0 (www.axon.com)
software.

Normalization - The intensity for each spot was calculated as its mean 
intensity minus the median background around the spot. The signal for each 
oligo was calculated as the average of intensity values of the four redundant 
spots of each oligo. Normalization of the oligo signals was performed at
several levels, as further described below.
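
As a minimal sketch of the per-spot and per-oligo calculation just described, assuming each replicate spot is represented by a hypothetical pair of values (mean foreground intensity, median local background):

```python
from statistics import mean


def spot_intensity(mean_foreground, median_background):
    """Spot intensity: mean spot intensity minus median surrounding background."""
    return mean_foreground - median_background


def oligo_signal(spots):
    """Average intensity over the replicate spots of one oligo.

    `spots` is an iterable of (mean_foreground, median_background) pairs,
    four pairs per oligo in the layout described in the text.
    """
    return mean(spot_intensity(fg, bg) for fg, bg in spots)
```

For example, four replicate spots with background-corrected intensities of 100, 100, 120 and 80 give an oligo signal of 100.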

Normalization of blocks was carried out in order to normalize the
gradient of intensities within each slide. For each block i, an Ai parameter was
calculated as the average of intensities of 56 positive control spots
(oligonucleotide probes for the ubiquitously expressed housekeeping genes
gapdh, actin, hsp70 and gnb2l1, in various probe concentrations). An average
A of all Ai averages was calculated. Based on this, a block normalization
factor Bi was calculated for each block, as Bi = A/Ai, and applied to each spot
in the block.
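
The block normalization factor Bi = A/Ai can be illustrated with a short Python sketch. The dictionary-based data layout and the function names are assumptions made for the example; only the arithmetic follows the text.

```python
from statistics import mean


def block_factors(control_intensities_by_block):
    """Return the factor Bi = A / Ai for each block.

    `control_intensities_by_block` maps a block identifier to the list of
    positive-control spot intensities measured in that block.
    """
    # Ai: average positive-control intensity within block i
    a_i = {block: mean(values)
           for block, values in control_intensities_by_block.items()}
    # A: average of all the Ai averages
    a = mean(list(a_i.values()))
    return {block: a / ai for block, ai in a_i.items()}


def normalize_block(spot_intensities, factor):
    """Apply the block factor to every spot intensity in the block."""
    return [x * factor for x in spot_intensities]
```

With two blocks whose control averages are 200 and 100, A is 150, so the factors are 0.75 and 1.5 respectively, pulling both blocks toward the slide-wide average.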

Normalization between slides was performed to bring all experiments to
the same scale. For each experiment, the average of intensities of the 192
negative control spots on the array was set to be the 0 (zero) of the new scale.
For a subset of highly signaling oligos, with intensities between the 70th and
the 95th percentiles of the oligo signal distribution of the experiment, the
average was arbitrarily set to be 500 in the new scale. The intensity of each
oligo signal was accordingly converted to this new scale, to obtain the
normalized signal. A ratio between the normalized cell-line signal and the
normalized pool signal was calculated for each oligo in each experiment. To
avoid misleading ratios coming from signals that were too low, the ratio Rji for
oligo j in experiment i was calculated as:
Rji = max[100, cell-line-signalji] / max[100, pool-signalji].
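
The rescaling and the floored ratio can be sketched as below. Two points are assumptions rather than statements from the text: the conversion to the new scale is taken to be linear (negative-control mean mapped to 0, the 70th-95th percentile mean mapped to 500), and percentiles are computed by the nearest-rank method. Function names are hypothetical.

```python
from statistics import mean


def percentile(sorted_values, q):
    """Nearest-rank percentile (integer q in 1..100) of an ascending list."""
    rank = max(1, (q * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]


def rescale_slide(oligo_signals, negative_controls,
                  low_q=70, high_q=95, target=500.0):
    """Linearly map signals so that the negative-control mean becomes 0 and
    the mean of signals between the two percentiles becomes `target`."""
    zero = mean(negative_controls)
    ordered = sorted(oligo_signals)
    low, high = percentile(ordered, low_q), percentile(ordered, high_q)
    anchor = mean(x for x in oligo_signals if low <= x <= high)
    scale = target / (anchor - zero)
    return [(x - zero) * scale for x in oligo_signals]


def floored_ratio(cell_line_signal, pool_signal, floor=100.0):
    """Rji = max[100, cell-line signal] / max[100, pool signal]."""
    return max(floor, cell_line_signal) / max(floor, pool_signal)
```

The floor of 100 means that two signals that are both below 100 always yield a ratio of exactly 1.0, which is why very weak signals cannot produce spurious fold changes.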

To normalize between red/green intensities in reciprocal experiments, 
the ratio Rjk for oligo j in cell-line k was calculated as the average of calculated 
ratios Rji between the two reciprocal experiments of the cell-line k. In cases 
where only one of the two reciprocal experiments showed an elevated or
decreased ratio, while in the other the ratio was 1.0, the average Rjk was
converted to 1.0.
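
A minimal sketch of this dye-swap averaging rule, with a hypothetical function name, is given below. It assumes that "the ratio was 1.0" means exact equality, which is what the floored ratio produces when both signals fall below the floor.

```python
def reciprocal_ratio(ratio_a, ratio_b):
    """Combine the ratios Rji of the two reciprocal experiments into Rjk.

    If exactly one experiment reports 1.0 (no change) while the other
    reports a change, the combined ratio is set to 1.0; otherwise the
    two ratios are averaged.
    """
    if (ratio_a == 1.0) != (ratio_b == 1.0):
        return 1.0
    return (ratio_a + ratio_b) / 2
```

The rule is conservative: a change is reported only when both labeling orientations show one.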

The actual pool signal for each oligo was calculated to be the average of 
the normalized oligo signals in the pool channel of all experiments. A virtual 
pool signal was calculated as the average of the normalized oligo signals in the 
cell-line channel of all experiments. The virtual pool signals were found to be
very close to the actual pool signals, indicating consistency in the analysis. 

Threshold determination - To determine an expression threshold above
which a normalized signal would be considered a 'positive' signal indicating
expression, the distribution of all 16,512 normalized negative control signals
and the standard deviation (neg-std-dev) were calculated. The neg-std-dev
obtained was 38. An oligo j was considered 'present' in a cell-line k if Rjk x
actual-pool-signalj > 4 x neg-std-dev.
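
With the reported neg-std-dev of 38, the criterion amounts to a cutoff of 4 x 38 = 152 on the estimated cell-line signal. A short Python sketch with hypothetical names:

```python
def is_present(ratio_jk, actual_pool_signal_j, neg_std_dev=38.0, multiplier=4.0):
    """True if Rjk x actual-pool-signalj exceeds multiplier x neg-std-dev.

    The defaults reproduce the values given in the text (38 and 4),
    i.e. a cutoff of 152 on the normalized scale.
    """
    return ratio_jk * actual_pool_signal_j > multiplier * neg_std_dev
```

For instance, an oligo with an actual pool signal of 80 and a ratio of 2.0 in a given cell line (estimated signal 160) would be called present, whereas the same oligo with a ratio of 1.0 would not.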

EXAMPLE 1 

Identification of 53BP1 and 76P RNA transcripts in a variety of human
tissues and cell-lines

Background: 

The tumor suppressor p53 binding protein 1 (SEQ ID NO: 15) is one of
the various p53 target proteins. It binds to the DNA-binding domain of p53 and
enhances p53-mediated transcriptional activation. 53BP1 is characterized by
several structural motifs shared by several proteins involved in DNA repair
and/or DNA damage-signaling pathways. 53BP1 becomes
hyperphosphorylated and forms discrete nuclear foci in response to DNA
damage induced by radiation and chemotherapy. Recent reports suggest that
53BP1 is an ataxia telangiectasia mutated (ATM) substrate that is involved
early in the DNA damage-signaling pathways in mammalian cells, attributing a
role to 53BP1 in the development of various mammalian pathologies.
Results: 

Two 53BP1 RNA sense transcripts with dissimilar 3' UTRs were
previously described [Iwabuchi K. et al. (1994) Proc. Natl. Acad. Sci. USA]
and are illustrated in Figure 6 (red and green). The Leads™ assembly program,
modified to uncover novel antisense transcripts, was used to uncover three such
transcripts for the 53BP1 gene, which have different 3' UTRs (SEQ
ID NOs: 16, 37 and 38) and encode the 76p gene product (GenBank accession
number NM_014444) (illustrated in blue).

To confirm expression of computationally retrieved antisense transcripts,
two RNA probes were generated. The schematic location of the probes used for
sense and antisense validation (Riboprobe#1 and Riboprobe#2, SEQ ID NOs: 17
and 18, respectively) is illustrated in Figure 6. These RNA
probes were used to identify the corresponding full-length transcripts.

As shown in Figure 7, Riboprobe#1 detected two transcripts of
approximately 6.3 Kb and 10.5 Kb, corresponding to the sense mRNA. The
absolute levels of the short messenger were rather homogeneous in all cell-lines
examined. The 10.5 Kb variant exhibited a more heterogeneous pattern of cellular
distribution, and was mostly expressed in K562, MG-63, 293 HEK and HeLa
cells. In general, the longer sense transcript, which is an alternatively
polyadenylated variant, was expressed at markedly lower levels in the various
cell lines examined.

The same membrane was used to perform northern analysis with
Riboprobe#2 in order to validate expression of antisense transcripts of 53BP1.
Results are shown in Figure 8. Three variants corresponding to the 76p gene
were detected in most of the cell lines: 6.8 Kb, 4.2 Kb and 2.5 Kb. Minor
fluctuations of expression were observed and the largest transcript was
expressed at significantly higher levels than the smaller transcripts.

A sense strand probe was used to detect expression of the antisense
transcripts in a variety of human tissues (Figure 9). The three alternatively
polyadenylated variants with different 3' UTRs were expressed in most of the
tissues. Total levels of these transcripts varied in the different tissues assayed.
For example, the highest level of expression for all three transcripts was observed
in the brain and testis, while no expression of the 6.8 Kb and 4.2 Kb variants
was detected in the spleen. Expression levels of each transcript are
summarized in Table 2 below.

Table 2

                  Transcript Mol. Weight (Kb)
Tissue            6.8      4.2      2.5
brain             +-H-     HII      III]
colon             +        ++       +
heart                      +        ++
kidney            ++       ++
liver                               +
lung              ++++              +
muscle            -H-      +
placenta          +        4+       ++
small intestine   ++       ++
spleen                              +
stomach
testis            ++       ++

Reverse transcription PCR (RT-PCR) analysis was performed in
order to substantiate the northern blot results. Primers were synthesized
according to the scheme shown in Figure 10 (indicated by arrows). The
expected amplification products corresponded completely to the observed
amplification products, supporting the existence of the various 53BP1
and 76p transcript variants.

EXAMPLE 2 

Identification of mRNA and complementary transcripts of the cell death

inducing DFF45-like effector (CIDE)-B
Background:

Cell death inducing DFF45-like effector (CIDE-B) (GenBank Accession
numbers AF190901 and AF218586) is a member of a novel family of
apoptosis-inducing factors that share homology with the N-terminal region of
DFF, the DNA fragmentation factor. Although the molecular mechanism of
CIDE-B induced apoptosis is unclear, mitochondrial localization and
dimerization were both shown to be required [Chen Z. et al. (2000) J. Biol.
Chem. 275:22619-22622]. Notably, over-expression of CIDE-B in mammalian
cells shows strong cell death-inducing activity, suggesting that aberrant
expression of this protein may be associated with a number of mammalian
pathologies [Inohara N. et al. (1998) EMBO J. 17:2526-2533].
Results:

Two sense transcripts of the CIDE-B gene with different 5' UTRs were
previously described [Inohara N. et al. (1998) EMBO J. 17:2526-2533 and
Lugovskoy A.A. et al. (1999) Cell 99:745-755] (SEQ ID NOs: 19 and 20).
Computational analysis recovered a potential elongated BLTR2 transcript (SEQ
ID NO: 21), showing full complementarity to the CIDE-B mRNA transcripts
(Figure 11).

Northern blot analysis was done in order to determine the distribution of
the CIDE-B sense and antisense transcripts in various cell-lines. A 430 base
pair DNA fragment was selected to generate RNA probes for identification of
both sense and antisense transcripts (SEQ ID NOs: 22 and 23, respectively).

Expression of antisense mRNA transcripts was detected in various cell-lines,
especially in the mammary gland adenocarcinoma cell line MCF-7, as
a predominant 6.5 Kb transcript, although larger forms were also visualized
(Figure 12). Low hybridization with a CIDE-B probe was detected (Figure 13).
Conclusion:

BLTR2 was recently identified as a putative seven-transmembrane
receptor with high homology to the leukotriene B4 receptor [Tryselius Y.
et al. (2000) Biochem. Biophys. Res. Commun. 274:377-82]. Although the
mechanism of action of BLTR2 is poorly understood, it is conceivable that
BLTR2 mRNA plays a role in the regulation of the CIDE-B apoptotic effector and
vice versa.

EXAMPLE 3 

Identification of mRNA and complementary transcripts of the apoptosis 

inducing factor APAF-1 

Background: 

A conserved series of events including cellular shrinkage, nuclear
condensation, externalization of plasma membrane phosphatidylserine, and
oligonucleosomal DNA fragmentation characterizes apoptotic cell death.
Regardless of the circumstance, induction and execution of apoptotic events
require activation of caspases, a family of aspartate-specific cysteine
proteinases. Caspase activation may be regulated by the mitochondrion and
specifically by the apoptosome, an oligomeric complex of
apoptotic protease-activating factor-1 (APAF-1), cytochrome c and dATP. The
apoptosome recruits and activates caspase-9, which in turn activates the
executioner caspases, caspase-3 and -7. The active executioners kill the cell by
proteolysis of key cellular substrates [Zou H. et al. (1999) J. Biol. Chem.
274:11549-11556]. Evasion or inactivation of the mitochondrial apoptosis
pathway may contribute to oncogenesis by allowing cell proliferation. In this
instance, unregulated cell proliferation may occur by inactivation of APAF-1,
which has been suggested to occur via genetic loss or inhibition by HSP-70 and
HSP-90. Although aberrant expression of APAF-1 was found in a variety of
malignancies (including ovarian epithelial cancer), no link was found to
accelerated protein degradation.
Results:

One RNA transcript has been previously described for APAF-1 [Zou H.
et al. (1999) J. Biol. Chem. 274:11549-11556] (SEQ ID NO: 10) (SEQ ID NO:
24). A computational search for natural antisense transcripts revealed two
complementary transcripts for APAF-1 messenger RNA (SEQ ID NOs: 25 and
26). These antisense transcripts include an open reading frame encoding the
EB-1 gene (GenBank accession numbers AF145204; AF164792). The overlap
between the APAF-1 messenger RNA and the longer antisense transcript is
at least 300 nucleotides.

To validate expression of the computationally retrieved antisense transcripts
for APAF-1, as well as expression of APAF-1 mRNA in the assayed human
cell lines, RNA-probes of 366 ribonucleotides were generated (sense and
antisense strands, respectively). The schematic location of the probes used for
sense and antisense validation (Riboprobe#1 and Riboprobe#2, SEQ ID NOs:
27 and 28, respectively) is illustrated in Figure 14.

As shown in Figure 15a, the sense RNA probe, directed at visualizing the
antisense transcripts, identified a clear band of 3 Kb corresponding to the long
computationally retrieved antisense transcript, as well as other transcripts
ranging in size from 1 Kb to 8 Kb. Transcripts were found in essentially all
cell lines, but especially in the 293 HEK and LNCaP lines.

Hybridization with an RNA probe directed at visualizing the mRNA
transcript of APAF-1 resulted only in a blurred pattern (Figure 15b).
However, a 7 Kb mRNA transcript consistent with APAF-1 mRNA was seen in
the LNCaP and 293 HEK cell lines.

Conclusion: 

A reciprocal pattern of expression was observed for the APAF-1 and
EB-1 transcripts, indicating an expressional relationship between the
sense and antisense transcripts that suggests antisense-mediated regulation
of expression.

EXAMPLE 4 

mRNA expression of muscle nicotinic acetylcholine receptor ε subunit and
its complementary MINK transcript

Background: 

The muscle nicotinic acetylcholine receptor ε subunit (AChRε) encodes
one of five subunits of a ligand-gated ion channel receptor located at the
neuromuscular synapse. AChRε is up-regulated in the postnatal period, when it
replaces the γ subunit of the receptor [Witzemann, V. et al., (1987) FEBS Lett. 223,
104-112]. It is also up-regulated in synapse development, specifically by the
trophic factor neuregulin [Martinou J. C. (1991) Proc. Natl. Acad. Sci. USA 88,
7669-7673]. In an attempt to decipher AChRε function and mechanism of
regulation, a computational screen for an AChRε complementary transcript was
carried out.

Results: 

One mRNA transcript of the AChRε gene was previously described [Beeson
D. Eur. J. Biochem (1993) 215, 229-238] (SEQ ID NO: 29). Computational
analysis recovered a complementary transcript belonging to Mink, a new
member of the germinal center kinase (GCK) family (SEQ ID NO: 30) [Dan I.
FEBS Lett. (2000) 469, 19-23], showing an overlap of at least 280 nucleotides
with the AChRε mRNA, as schematically illustrated in Figure 16.

To validate the overlap of the two genes and to learn about their tissue
distribution, northern analysis of a variety of human tissues was performed.
A membrane containing poly(A) RNA was hybridized with 280-nucleotide
RNA probes corresponding to the overlap region in either antisense or sense
orientation (SEQ ID NOs: 31 and 32, respectively).

As is evident from Figure 17a, an AChRε transcript was expressed as a
predominant 4 Kb band and had the highest expression in the heart, kidney and
brain, while surprisingly only limited expression was observed in the skeletal
muscle.

Hybridization with a MINK-specific RNA probe revealed a major
transcript of about 5 Kb, in accordance with previous results [Dan I. FEBS Lett.
(2000) 469, 19-23] (Figure 17b). The mRNA transcript was ubiquitously
expressed, with the strongest expression found in brain, liver, thymus, spleen and
pancreas, again in agreement with Dan I. et al.

Conclusion: 

The finding that the AChRε and Mink genes are antisense to one
another with a significant overlap, and the fact that the two genes are
co-expressed in some tissues (e.g., brain), suggest the possibility that one of
them may regulate the other under certain conditions.

EXAMPLE 5 

Expression of Cyclin E2 mRNA and complementary transcripts in a variety 

of human cell-lines

Background: 

The human cyclin E2 gene encodes a 404-amino-acid protein that is
most closely related to cyclin E. Cyclin E2 associates with Cdk2 in a functional
kinase complex that is inhibited by both p27(Kip1) and p21(Cip1). The
catalytic activity associated with cyclin E2 complexes is cell cycle regulated
and peaks at the G1/S transition. Overexpression of cyclin E2 in mammalian
cells accelerates cell-cycle progression. Unlike cyclin E1, cyclin E2 levels are
low to undetectable in nontransformed cells and increase significantly in
tumor-derived cells, suggesting a specific mechanism of regulation.
Results:

One RNA transcript was found for cyclin E2 (SEQ ID NO: 33).
A computational search for natural antisense transcripts revealed one
complementary transcript for cyclin E2 messenger RNA (SEQ ID NO: 34).
The overlap between the cyclin E2 sense RNA and the antisense transcript is
at least 72 nucleotides.

To confirm expression of the computationally retrieved antisense
transcript for cyclin E2 as well as of cyclin E2 mRNA in human cell lines, two
RNA-probes of 800 ribonucleotides were generated. The schematic location of the
probes used for sense and antisense validation (SEQ ID NO: 44, Riboprobe#1)
is illustrated in Figure 18.

As shown in Figure 19a, Riboprobe#1 detected two transcripts of
approximately 3 Kb and 4.3 Kb. The absolute levels of the transcripts were
quite heterogeneous among the cell-lines examined. Both transcripts were
completely absent from the LNCaP cell line, while markedly high expression was
observed in the MCF-7 and DLD-1 lines, especially of the short transcript.

The same membrane was used to perform northern analysis with
Riboprobe#2 in order to validate expression of antisense transcripts of cyclin
E2. As is evident from Figure 19b, an antisense transcript 3.8 Kb long was
observed in most cells assayed. Markedly high expression was
observed in the K562, MCF-7 and DLD-1 cell lines, while only a very moderate
level of expression was detected in the LNCaP and HepG2 cell lines.

EXAMPLE 6

Co-regulated expression of CIDE-B and its complementary transcript upon 

induction of apoptosis 
The discovery of a novel naturally occurring antisense transcript to the
apoptosis inducing factor, CIDE-B (see Example 2 hereinabove), suggested that
the latter may be regulated by its complementary transcript, thereby establishing
a novel mechanism of regulation. To address this, differential expression
analysis of CIDE-B and its endogenous antisense transcript
was performed following induction of apoptosis.

Materials and methods

Induction of apoptosis and reverse transcription analysis -
Monolayers of 293 cells were either left untreated (UT) or incubated
with increasing concentrations of etoposide or staurosporine (Sigma IL).
Twenty-four hours following addition of the drug, total RNA was extracted as
described hereinabove. Purified RNA was further treated with DNase I.
Reverse transcription reactions were carried out with equivalent amounts of RNA
in a final volume of 20 μl containing 100 pmol of the oligo(dT) primer, 250 ng
of total RNA, 0.5 mM each of four deoxynucleoside triphosphates and 5 units
of reverse transcriptase. The reaction mixture was incubated at 65 °C for 5 min,
42 °C for 50 min and 70 °C for 15 min. PCR was carried out in a final volume
of 25 μl containing 12.5 pmol each of the oligonucleotide primers derived from
exons 3 and 7 of CIDE-B (SEQ ID NOs: 39 and 40), 1 μl of RT solution and
1.75 units of Taq polymerase. Amplification was carried out by an initial
denaturation step at 94 °C for 5 min followed by 35 cycles of [94 °C for 30 s, 68
°C for 30 s, and 68 °C for 130 min]. At the end of the PCR amplification,
products were analyzed on agarose gels stained with ethidium bromide and
visualized with UV light.
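
The reagent amounts above are stated as absolute quantities per reaction. As a rough cross-check, the following minimal sketch converts them to working concentrations, using the fact that 1 pmol per μl equals 1 μM. The helper names `micromolar` and `ng_per_ul` are illustrative only and are not part of the described protocol; complete carry-over of RNA into the RT solution is assumed.

```python
def micromolar(pmol: float, volume_ul: float) -> float:
    """Molar concentration in uM; 1 pmol per ul equals 1 umol per litre."""
    return pmol / volume_ul


def ng_per_ul(mass_ng: float, volume_ul: float) -> float:
    """Mass concentration in ng per ul."""
    return mass_ng / volume_ul


# Reverse transcription: 20 ul reaction, 100 pmol oligo(dT), 250 ng total RNA
rt_primer_um = micromolar(100, 20)
rt_rna_ng_per_ul = ng_per_ul(250, 20)

# PCR: 25 ul reaction, 12.5 pmol of each primer, 1 ul of RT solution as template
pcr_primer_um = micromolar(12.5, 25)
# Assumption: the RT solution carries the input RNA equivalent unchanged
rna_equivalent_ng = rt_rna_ng_per_ul * 1

print(rt_primer_um, rt_rna_ng_per_ul, pcr_primer_um, rna_equivalent_ng)
```

Under these figures the oligo(dT) primer is at 5 μM in the RT reaction, each PCR primer is at 0.5 μM, and each PCR receives the cDNA equivalent of about 12.5 ng of input RNA.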
Results

The amplification reaction yielded two major PCR products of 740 bp and
2285 bp (Figure 20). The small (740 bp) PCR product was derived from the sense
(CIDE-B) strand, whereas the larger (2285 bp) product represented an
intronless antisense transcript. Evidently, an increase of sense transcript,
concomitant with a decrease of antisense transcript, was observed following
treatment with etoposide (lanes 1-4) as compared to untreated cells (lane 9),
while no change was detected following staurosporine treatment (lanes 5-8).

These results suggest that following induction of apoptosis, antisense 
regulation of CIDE-B is abolished thereby allowing CIDE-B mediated 
apoptosis to proceed. 



EXAMPLE 7

Reciprocal variation in sense and antisense expression of mouse nicotinic 
acetylcholine receptor, epsilon subunit during differentiation 
The mouse nicotinic acetylcholine receptor, epsilon subunit (mAChRε)
(SEQ ID NO: 35) has a critical function in a variety of differentiation
processes. To address a novel concept of antisense regulation of
AChRε-mediated differentiation, expression patterns of AChRε and its naturally
occurring antisense transcript (SEQ ID NO: 36) were examined following
induction of differentiation.

Materials and methods

Induction of differentiation and reverse transcription analysis - C2 mouse
myoblast cells were incubated with a differentiation medium (Dulbecco's
modified Eagle's medium (DMEM) including 10 pg/ml insulin and 10 ng/ml
transferrin) or control medium (untreated) for 48 and 72 hours. Total RNA
was extracted from treated and control cells and reverse-transcribed. PCR was
done using F4 and R3 primers, derived from exons 10 and 12 (last
exon; SEQ ID NOs: 41 and 42, respectively) of the mouse nicotinic
acetylcholine receptor, epsilon subunit (mAChRε) and directed at detecting
sense and antisense transcripts (see Figure 21a).

Results

The amplification reaction showed a gradual increase in AChRε transcript
expression, concomitant with the differentiation state of the cells. A second
amplification product, which corresponded to an unspliced transcript, was seen
in untreated cells and disappeared following induction of differentiation. This
fragment corresponds to a putative antisense transcript of AChRε, and may
represent an alternative 3' UTR of the Mink gene, of which the known
transcript terminates 400 bp downstream of AChRε (see Example 4). To
overcome possible competition between the two transcripts, another PCR
reaction was carried out using antisense-specific primers F4 and R4 (SEQ ID
NO: 43). The RT-PCR products of this amplification reaction showed a
single band which corresponded to a naturally occurring antisense transcript of
AChRε. As expected, this transcript disappeared following induction of
differentiation.

These results imply inverse regulation of AChRε and its naturally
occurring antisense transcript during muscle cell differentiation from
myoblasts to myotubes. Regulation may possibly proceed through
complementation of the sense and antisense transcripts to form dsRNA, which
can serve as a substrate for double strand RNA processing enzymes such as
RNase H.

EXAMPLE 8 

A polynucleotide database of sequences corresponding to the naturally 
occurring antisense transcripts identified by the present invention and their 

complementary sense sequences 
Naturally occurring antisense sequences identified according to the
teachings of the present invention and their corresponding sense sequences are
provided in the CD-ROMs enclosed herewith. File content: CD-ROM1
includes a "seq" text file which contains the actual polynucleotide sequences,
and a "table" file which contains summarized data pertaining to each
sense-antisense sequence pair. CD-ROM2 includes an "aligments" file which contains
sequence alignments of sense and antisense overlapping regions. CD-ROM3
contains Excel files: "Table S1" and "Table S2", further described in Example
9.

Table 3 below exemplifies the format of the Table provided in
CD-ROM1. Each row represents a pair of transcripts. The columns of Table 3
represent (from the left): the serial number of the pair, the name of the first
transcript, its length in nucleotides, the name of the second transcript, its length
in nucleotides, the number of base pairs that overlap between the two
transcripts, and the offsets at which the overlap begins in the first transcript
and in the second transcript.



Table 3

Serial  First transcript            First        Second transcript          Second       Overlap  Start of overlap
No.                                 transcript                              transcript   length   in first / in second
                                    length (nt)                             length (nt)  (nt)     transcript
570_0   AV705532_0 (SEQ ID NO: 1)   190          Z44352_15 (SEQ ID NO: 2)   783          OL: 52   OF1: 1 OF2: 1
570_1   AV705532_0                  190          Z44352_14 (SEQ ID NO: 3)   1649         OL: 52   OF1: 1 OF2: 1
570_2   AV705532_0                  190          Z44352_13 (SEQ ID NO: 4)   1861         OL: 52   OF1: 1 OF2: 1
571_0   AW070860_0 (SEQ ID NO: 5)   214          T81142_7 (SEQ ID NO: 6)    1934         OL: 54   OF1: 1 OF2: 1162
571_1   AW070860_0                  214          T81142_6 (SEQ ID NO: 7)    2353         OL: 54   OF1: 1 OF2: 1162
571_2   AW070860_0                  214          T81142_4 (SEQ ID NO: 8)    2500         OL: 54   OF1: 1 OF2: 1264
571_3   AW070860_0                  214          T81142_3 (SEQ ID NO: 9)    947          OL: 54   OF1: 1 OF2: 171
571_4   AW070860_0                  214          T81142_2 (SEQ ID NO: 10)   1366         OL: 54   OF1: 1 OF2: 171
572_0   BE046369_0 (SEQ ID NO: 11)  422          W26553_3 (SEQ ID NO: 12)   1532         OL: 52   OF1: 1 OF2: 1532
572_1   BE046369_0                  422          W26553_2 (SEQ ID NO: 13)   1753         OL: 52   OF1: 1 OF2: 1753
572_2   BE046369_0                  422          W26553_1 (SEQ ID NO: 14)   1832         OL: 52   OF1: 1 OF2: 1832

Pairs of transcripts (numbered to the right of the underscore) belong to a pair of
contigs (numbered to the left of the underscore). Transcript names are arbitrary designations.
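
The row layout of Table 3 can be read programmatically. The following is a minimal sketch that assumes each record is a single whitespace-delimited text line carrying the seven fields in the order listed above, with the `OL:`, `OF1:` and `OF2:` prefixes as printed; the actual layout of the "table" file on CD-ROM1 is not reproduced here, so the pattern and the names `OverlapRecord`, `parse_row` and `contig_pair` are hypothetical. Because the direction in which the offsets are counted is not stated, the sketch only parses and sanity-checks the values and does not derive overlap intervals.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlapRecord:
    serial: str         # e.g. "571_0": contig pair 571, transcript pair 0
    first_name: str
    first_len: int      # length of first transcript (nt)
    second_name: str
    second_len: int     # length of second transcript (nt)
    overlap_len: int    # "OL" field (nt)
    offset_first: int   # "OF1" field
    offset_second: int  # "OF2" field


# Hypothetical one-line layout, modelled on the rows printed in Table 3
ROW = re.compile(
    r"(?P<serial>\d+_\d+)\s+"
    r"(?P<first>\S+)\s+(?P<flen>\d+)\s+"
    r"(?P<second>\S+)\s+(?P<slen>\d+)\s+"
    r"OL:\s*(?P<ol>\d+)\s+"
    r"OF1:\s*(?P<of1>\d+)\s+OF2:\s*(?P<of2>\d+)"
)


def parse_row(line: str) -> OverlapRecord:
    """Parse one record; reject rows whose overlap exceeds either transcript."""
    m = ROW.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    rec = OverlapRecord(
        serial=m["serial"],
        first_name=m["first"], first_len=int(m["flen"]),
        second_name=m["second"], second_len=int(m["slen"]),
        overlap_len=int(m["ol"]),
        offset_first=int(m["of1"]), offset_second=int(m["of2"]),
    )
    if rec.overlap_len > min(rec.first_len, rec.second_len):
        raise ValueError("overlap longer than the shorter transcript")
    return rec


def contig_pair(rec: OverlapRecord) -> str:
    """Contig-pair number, i.e. the part of the serial left of the underscore."""
    return rec.serial.split("_")[0]


rec = parse_row("571_0 AW070860_0 214 T81142_7 1934 OL: 54 OF1: 1 OF2: 1162")
print(contig_pair(rec), rec.overlap_len, rec.offset_second)
```

Grouping records by `contig_pair` would collect all transcript pairs that belong to the same pair of contigs, such as the five 571_x rows shown in Table 3.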

Sequence alignment of the overlapping region in each sense and
antisense pair of Table 1 is demonstrated in Figure 4a-k. Alignments were
performed using the BLAST sequence alignment algorithm (Basic Local
Alignment Search Tool, available through www.ncbi.nlm.nih.gov/BLAST).
Interestingly, the alignment profile shows a high level of variability with
regard to overlap lengths. It is conceivable that short overlaps are due to
technical reasons associated with insufficient sequence data.
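
The idea of a sense-antisense overlap can be illustrated with a toy example. The sketch below is not BLAST and is not the method used to produce the alignments described above; it is a simplified stand-in that reverse-complements a candidate sequence and finds the longest exactly matching stretch shared with the sense sequence, which is the region where the two strands could form a perfect duplex. The function names and the test sequences are illustrative assumptions.

```python
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T, either case)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def longest_antisense_overlap(sense: str, candidate: str) -> Tuple[int, int, int]:
    """Longest exact stretch shared by `sense` and revcomp(`candidate`).

    Returns (length, start in sense, start in revcomp(candidate)), 0-based.
    Exact matching only: mismatches and gaps, which BLAST tolerates,
    are not handled by this toy dynamic-programming routine.
    """
    rc = reverse_complement(candidate)
    best = (0, 0, 0)
    prev = [0] * (len(rc) + 1)
    for i in range(1, len(sense) + 1):
        cur = [0] * (len(rc) + 1)
        for j in range(1, len(rc) + 1):
            if sense[i - 1] == rc[j - 1]:
                # Extend the match that ended at the previous position pair
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best


sense = "GATTACAGGCT"
candidate = reverse_complement("TTTTACAGGGG")  # hypothetical antisense read
length, sense_start, rc_start = longest_antisense_overlap(sense, candidate)
print(length, sense[sense_start:sense_start + length])
```

Since this routine demands exact identity, real expressed sequences with sequencing errors would yield shorter overlaps than a tolerant local alignment would report, which is one way short overlaps can arise from limited or noisy sequence data.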

The putative naturally occurring antisense transcripts identified by the
present invention and disclosed in the enclosed CD-ROMs can be used to detect
and/or treat a variety of diseases, disorders or conditions, examples of which
are listed hereinunder. For example, antisense transcripts or sequence
information derived therefrom can be used to construct microarray kits
(described in detail in the preferred embodiments section) dedicated to
diagnosing specific diseases, disorders or conditions.

The following sections list examples of proteins (subsection i), based on
their molecular function, which participate in a variety of diseases (listed in
subsection ii), which diseases can be diagnosed/treated using information
derived from naturally occurring antisense transcripts such as those uncovered
by the present invention.

i. Molecular function

Defense/immunity proteins

Information derived from proteins involved in the immune and
complement systems, such as acute-phase response proteins, antimicrobial
peptides, antiviral response proteins, blood coagulation factors, complement
components, immunoglobulins, major histocompatibility complex antigens, and
opsonins can be used to diagnose/treat diseases involving the immunological
system including inflammation, autoimmune diseases, infectious diseases, as
well as cancerous processes, and diseases which are manifested by abnormal
coagulation processes, which may include abnormal bleeding or excessive
coagulation.

Immunoglobulins

Information derived from proteins involved in the immune and complement
systems including antigens and autoantigens, immunoglobulins, MHC and HLA
proteins and their associated proteins can be used to diagnose/treat diseases
involving the immunological system including inflammation, autoimmune
diseases, infectious diseases, as well as cancerous processes.

Nucleotide binding proteins

Information derived from ligand binding or carrier proteins can be used 
to diagnose/treat diseases involving dysregulated expression, activity or 
localization of nucleotide binding proteins. 

Nucleic acid binding proteins

Information derived from proteins involved in RNA and DNA synthesis
and expression regulation, such as transcription factors, RNA and DNA binding
proteins, zinc fingers, helicases, isomerases, histones, nucleases,
ribonucleoproteins, transcription and translation factors and others can be used
to diagnose/treat diseases involving DNA or RNA binding proteins such as
helicases, isomerases, histones and nucleases, for example diseases where there
is abnormal replication or transcription of DNA and RNA, respectively.

RNA polymerase II transcription factors 

Information derived from proteins such as specific and non-specific
RNA polymerase II transcription factors, enhancer binding, ligand-regulated
transcription factors and general RNA polymerase II transcription factors can be
used to diagnose/treat diseases involving RNA polymerase II transcription
factors, for example disorders involving abnormal transcription of RNA.

RNA binding proteins

Information derived from RNA binding proteins involved in splicing and
translation regulation, such as tRNA binding proteins, RNA helicases,
double-stranded RNA and single-stranded RNA binding proteins, mRNA binding
proteins, snRNA cap binding proteins, 5S RNA and 7S RNA binding proteins,
poly-pyrimidine tract binding proteins, snRNA binding proteins, and
AU-specific RNA binding proteins can be used to diagnose/treat diseases involving
transcription and translation factors such as helicases, isomerases, histones and
nucleases, for example diseases where there is abnormal transcription,
splicing, post-transcriptional processing, translation or stability of the RNA.

Chaperones 

Information derived from proteins such as ribosomal chaperone,
peptidylprolyl isomerase, lectin-binding chaperone, nucleosome assembly
chaperone, chaperonin ATPase, cochaperone, heat shock protein,
HSP70/HSP90 organizing protein, fimbrial chaperone, metallochaperone,
tubulin folding, HSC70-interacting protein can be used to diagnose/treat
diseases involving pathological conditions which are associated with abnormal
protein activity or structure. Binding of the products of the proteins of
this family, or antibodies reactive therewith, can modulate a plurality of protein
activities as well as change protein structure. Alternatively, such information
can be used for diseases in which there is abnormal degradation of other
proteins, which may cause abnormal accumulation of various proteinaceous
products in cells, abnormal (prolonged or shortened) activity of proteins, etc.

Motor proteins

Information derived from proteins that generate force or energy by the
hydrolysis of ATP and that function in the production of intracellular
movement or transportation including microfilament motor, axonemal motor,
microtubule motor, kinetochore motor (like dynein, kinesin, or myosin) can be
used to diagnose/treat diseases involving abnormal chemotactic movement or
motor dependent macromolecule operation such as of dynamin, which affects
the regulated endocytic process.

Actin binding proteins

Information derived from actin binding proteins, such as actin
cross-linking, actin bundling, F-actin capping, actin monomer binding, actin lateral
binding, actin depolymerizing, actin monomer sequestering, actin filament
severing, actin modulating, membrane associated actin binding, actin thin
filament length regulation and actin polymerizing proteins can be used to
diagnose/treat diseases involving cytoskeletal malformations, aberrant cellular
morphology affecting extracellular interactions and dysregulated intracellular
signaling.

Enzymes 

Information derived from proteins possessing enzymatic activities, such
as mannosylphosphate transferase, para-hydroxybenzoate:polyprenyltransferase,
Rieske iron-sulfur protein, imidazoleglycerol-phosphate synthase, sphingosine
hydroxylase, tRNA 2'-phosphotransferase, sterol C-24(28) reductase, C-8 sterol
isomerase, C-22 sterol desaturase, C-14 sterol reductase, C-3 sterol
dehydrogenase (C-4 sterol decarboxylase), 3-keto sterol reductase, C-4 methyl
sterol oxidase, dihydronicotinamide riboside quinone reductase, glutamate
phosphate reductase, DNA repair enzyme, telomerase, alpha-ketoacid
dehydrogenase, beta-alanyl-dopamine synthase, RNA editase, aldo-keto reductase,
alkylbase DNA glycosidase, glycogen debranching enzyme, dihydropterin
deaminase, dihydropterin oxidase, dimethylnitrosamine demethylase, ecdysteroid
UDP-glucosyl/UDP glucuronosyl transferase, glycine cleavage system, helicase,
histone deacetylase, mevaldate reductase, monooxygenase, poly(ADP-ribose)
glycohydrolase, pyruvate dehydrogenase, serine esterase, sterol carrier protein
X-related thiolase, transposase, tyramine-beta hydroxylase, para-aminobenzoic
acid (PABA) synthase, glu-tRNA(gln) amidotransferase, molybdopterin cofactor
sulfurase, lanosterol 14-alpha-demethylase, aromatase, 4-hydroxybenzoate
octaprenyltransferase, 7,8-dihydro-8-oxoguanine-triphosphatase, CDP-alcohol
phosphotransferase, 2,5-diamino-6-(ribosylamino)-4(3H)-pyrimidinone
5'-phosphate deaminase, diphosphoinositol polyphosphate phosphohydrolase,
gamma-glutamyl carboxylase, small protein conjugating enzyme, small protein
activating enzyme, 1-deoxyxylulose-5-phosphate synthase, 2-phosphotransferase,
2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinone hydroxylase,
2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, 3,4
dihydroxy-2-butanone-4-phosphate synthase, 4-amino-4-deoxychorismate lyase,
4-diphosphocytidyl-2C-methyl-D-erythritol synthase,
ADP-L-glycero-D-manno-heptose synthase, D-erythro-7,8-dihydroneopterin
triphosphate 2'-epimerase, N-ethylmaleimide reductase, O antigen ligase, O
antigen polymerase, UDP-2,3-diacylglucosamine hydrolase, arsenate reductase,
carnitine racemase, cobalamin [5'-phosphate] synthase, cobinamide phosphate
guanylyltransferase, enterobactin synthetase, enterochelin esterase,
enterochelin synthetase, glycolate oxidase, integrase, lauroyl transferase,
peptidoglycan synthetase, phosphopantetheinyltransferase,
phosphoglucosamine mutase, phosphoheptose isomerase, quinolinate
synthase, siroheme synthase, N-acylmannosamine-6-phosphate 2-epimerase,
N-acetyl-anhydromuramoyl-L-alanine amidase, carbon-phosphorus lyase,
heme-copper terminal oxidase, disulfide oxidoreductase, phthalate dioxygenase
reductase, sphingosine-1-phosphate lyase, molybdopterin oxidoreductase,
dehydrogenase, NADPH oxidase, naringenin-chalcone synthase,
N-ethylammeline chlorohydrolase, polyketide synthase, aldolase, kinase,
phosphatase, CoA-ligase, oxidoreductase, transferase, hydrolase, lyase,
isomerase, ligase, ATPase, sulfhydryl oxidase, lipoate-protein ligase,
delta-1-pyrroline-5-carboxylate synthetase, lipoic acid synthase and tRNA
dihydrouridine synthase can be used to diagnose/treat diseases which can be
ameliorated by modulating the activity of various enzymes which are involved
both in enzymatic processes inside cells as well as in cell signaling.

Protein serine/threonine kinases

Information derived from kinases which phosphorylate serine/threonine
residues, mainly involved in signal transduction, such as transmembrane
receptor protein serine/threonine kinase, 3-phosphoinositide-dependent protein
kinase, DNA-dependent protein kinase, G-protein-coupled receptor
phosphorylating protein kinase, SNF1A/AMP-activated protein kinase, casein
kinase, calmodulin regulated protein kinase, cyclic-nucleotide dependent
protein kinase, cyclin-dependent protein kinase, eukaryotic translation initiation
factor 2alpha kinase, galactosyltransferase-associated kinase, glycogen synthase
kinase 3, protein kinase C, receptor signaling protein serine/threonine kinase,
ribosomal protein S6 kinase and IκB kinase can be used to treat or detect
diseases which may be ameliorated by modulating kinase
activity, which is one of the main signaling pathways inside the cell.

Enzyme inhibitors

Information derived from inhibitors and suppressors of other proteins
and enzymes, such as inhibitors of kinases, phosphatases, chaperones,
guanylate cyclase, DNA gyrase, ribonuclease, proteasome inhibitors,
diazepam-binding inhibitor, ornithine decarboxylase inhibitor, GTPase inhibitors, dUTP
pyrophosphatase inhibitor, phospholipase inhibitor, proteinase inhibitor, protein
biosynthesis inhibitors, alpha-amylase inhibitors can be used to treat diseases in
which a beneficial effect may be achieved by modulating the activity of inhibitors
and suppressors of proteins and enzymes.

Signal transducers

Information derived from various signal transducers, such as activin
inhibitors, receptor-associated proteins, alpha-2 macroglobulin receptors,
morphogens, quorum sensing signal generators, quorum sensing response
regulators, receptor signaling proteins, ligands, receptors, two-component
sensor molecules, two-component response regulators can be used to
diagnose/treat diseases involving abnormal signal transduction, either as a
cause or as a result of the disease.




WO 03/046220 PCT/IL02/00904 

Receptors 

Information derived from various receptors, such as signal transducers,
complement receptors, ligand-dependent nuclear receptors, transmembrane
receptors, GPI-anchored membrane-bound receptors, various coreceptors,
internalization receptors, receptors to neurotransmitters, hormones and various
other effectors and ligands can be used to diagnose/treat diseases involving
various receptors, including receptors to neurotransmitters, hormones and
various other effectors and ligands.

Receptor signaling proteins

Information derived from receptor proteins involved in signal
transduction, such as receptor signaling protein serine/threonine kinase,
receptor signaling protein tyrosine kinase, receptor signaling protein tyrosine
phosphatase, aryl hydrocarbon receptor nuclear translocator,
hematopoietin/interferon-class (D200-domain) cytokine receptor signal
transducer, transmembrane receptor protein tyrosine kinase signaling protein,
transmembrane receptor protein serine/threonine kinase signaling protein,
receptor signaling protein serine/threonine kinase signaling protein, receptor
signaling protein serine/threonine phosphatase signaling protein, small GTPase
regulatory/interacting protein, receptor signaling protein tyrosine kinase
signaling protein, and receptor signaling protein serine/threonine phosphatase
can be used to diagnose/treat diseases involving non-normal signal
transduction, either as a cause, or as a result of the disease.
Small GTPase regulatory/interacting proteins 

Information derived from small GTPase regulatory proteins, such as 
RAB escort protein, guanyl-nucleotide exchange factor, guanyl-nucleotide
exchange factor adaptor, GDP-dissociation inhibitor, GTPase inhibitor, GTPase
activator, guanyl-nucleotide releasing factor, GDP-dissociation stimulator,
regulator of G-protein signaling, RAS interactor, RHO interactor, RAB 
interactor, and RAL interactor can be used to diagnose/treat diseases involving 




signal transduction, typically involving G-proteins, that is non-normal, either
as a cause, or as a result of the disease.
Ligands 

Information derived from ligands such as opioid peptides, baboon 
receptor ligand, branchless receptor ligand, breathless receptor ligand, ephrin,
frizzled receptor ligand, frizzled-2 receptor ligand, heartless receptor ligand, 
Notch receptor ligand, patched receptor ligand, punt receptor ligand, Ror 
receptor ligand, saxophone receptor ligand, SE20 receptor ligand, sevenless 
receptor ligand, smooth receptor ligand, thickveins receptor ligand, Toll 
receptor ligand, Torso receptor ligand, death receptor ligand, scavenger
receptor ligand, neuroligin, integrin ligand, hormones, pheromones, growth 
factors and sulfonylurea receptor ligand can be used to diagnose/treat: 

(a) diseases involving non-normal secretion of proteins, which may
be due to non-normal presence, absence or non-normal response to normal
levels of secreted proteins including hormones, neurotransmitters, and various
other proteins secreted by cells to the extracellular environment;

(b) diseases which are endocrine in essence (cause or are a result of
hormones), or may be ameliorated by raising or decreasing the level of
hormones and proteins;

(c) diseases which may be ameliorated by modulating the
concentration, activity, interaction, binding, etc., of growth factors,
cytokines, interleukins, interferon and lymphokines, typically diseases such as
autoimmune diseases, inflammation related disease, Graft vs. Host diseases,
diseases caused by infectious agents, cancer diseases, as well as disease
originating from improper concentration of growth factors causing non-normal
(either excessive or too little of) growth of various tissues themselves, or
causing untimely death of a desired cell population; and

(d) diseases which are manifested by non-normal development,
which may be non-normal development of the organism (genetic diseases
involving non-normal development of a fetus), non-normal development of a
tissue (a tissue which is not properly developed) as well as cancer diseases.
Cell adhesion molecules 

Information derived from proteins that serve as adhesion molecules
between adjoining cells, such as membrane-associated protein with guanylate
kinase activity, cell adhesion receptor, neuroligin, calcium-dependent cell
adhesion molecule, selectin, calcium-independent cell adhesion molecule and
extracellular matrix protein can be used to diagnose/treat diseases where
adhesion between adjoining cells is involved, typically conditions in which the
adhesion is non-normal. Typical examples of such conditions are cancer
conditions in which non-normal adhesion may cause and enhance the process of
metastasis. Other examples of such conditions include conditions of non-
normal growth and development of various tissues in which modulation of
adhesion among adjoining cells can improve the condition.

Structural proteins

Information derived from proteins involved in cell structure, such as 
ribosomal proteins, cell wall proteins, cytoskeletal proteins, extracellular matrix 
proteins, extracellular matrix glycoproteins, amyloid proteins, plasma proteins, 
eye lens proteins, chorion proteins (sensu Insecta), cuticle proteins (sensu
Insecta), puparial glue protein (sensu Diptera), bone proteins, yolk proteins,
muscle proteins, vitelline membrane proteins (sensu Insecta), peritrophic
membrane proteins (sensu Insecta), and nuclear pore proteins can be used to
diagnose/treat diseases involving abnormalities in cytoskeleton, including
cancerous cells, and diseased cells including those which do not propagate,
grow or function normally, as well as diseases involving non-normal sub-cellular
proteins such as non-normal ribosomal proteins.
Transporter proteins 

Information derived from proteins such as amine/polyamine transporter,
lipid transporter, neurotransmitter transporter, organic acid transporter, oxygen
transporter, water transporter, carriers, intracellular transporters, protein
transporters, ion transporters, carbohydrate transporter, polyol transporter,
amino acid transporters, vitamin/cofactor transporters, siderophore
transporter, drug transporter, channel/pore class transporter, group
translocator, auxiliary transport proteins, permeases, murein transporter,
organic alcohol transporter, and nucleobase, nucleoside, nucleotide and nucleic
acid transporters can be used to diagnose/treat diseases in which abnormal
transport of molecules and macromolecules such as neurotransmitters,
hormones, sugars, etc., leads to various pathologies.
Intracellular transporters 

Information derived from proteins that mediate the transport of
molecules and macromolecules inside the cell, such as intracellular nucleoside
transporter, vacuolar assembly proteins, vesicle transporters, vesicle fusion
proteins, and type II protein secretors can be used to diagnose/treat diseases in
which abnormal transport of molecules and macromolecules leads to various
pathologies.

Ligand binding or carrier proteins 

Information derived from various proteins, involved in diverse
biological functions, such as pyridoxal phosphate binding, carbohydrate
binding, magnesium binding, amino acid binding, cyclosporin A binding, nickel
binding, chlorophyll binding, biotin binding, penicillin binding, selenium
binding, tocopherol binding, lipid binding, drug binding, oxygen transporter,
electron transporter, steroid binding, juvenile hormone binding, retinoid
binding, heavy metal binding, calcium binding, protein binding,
glycosaminoglycan binding, folate binding, odorant binding, lipopolysaccharide
binding, and nucleotide binding can be used to diagnose/treat diseases
involving improper intracellular or extracellular accumulation or removal of
small molecules such as calcium ions, improper incorporation of metals and
modified amino acids (e.g., selenocysteine), dysregulated signaling effected by
improper steroid titration, etc.

Electron transporters

Information derived from ligand binding proteins or carrier proteins 
involved in electron transport, such as flavin-containing electron transporter, 
cytochromes, electron donors, electron acceptors, electron carriers and 
cytochrome-c oxidases can be used to diagnose/treat diseases involving
dysregulated mitochondrial activity. 

Calcium binding proteins 

Information derived from calcium binding proteins, ligand binding 
proteins or carriers, such as diacylglycerol kinase, calpain, calcium-dependent
protein serine/threonine phosphatase, calcium sensing proteins and calcium
storage proteins can be used to diagnose/treat diseases in which intracellular or 
extracellular calcium storage or release is improper. 

Binding proteins 

Information derived from various proteins exhibiting intermediate
filament binding, LIM-domain binding, LRR-domain binding, clathrin binding,
ARF binding, vinculin binding, KU70 binding, troponin C binding, PDZ-
domain binding, SH3-domain binding, fibroblast growth factor binding,
membrane-associated protein with guanylate kinase activity interacting, Wnt-
protein binding, DEAD/H-box RNA helicase binding, beta-amyloid binding,
myosin binding, TATA-binding protein binding, DNA topoisomerase I binding,
polypeptide hormone binding, RHO binding, FH1-domain binding, syntaxin-1
binding, HSC70-interacting, transcription factor binding, metarhodopsin
binding, tubulin binding, JUN kinase binding, RAN protein binding, protein
signal sequence binding, importin alpha export receptor, poly-glutamine tract
binding, protein carrier, beta-catenin binding, protein C-terminus binding,
lipoprotein binding, cytoskeletal protein binding protein, nuclear localization
sequence binding, protein phosphatase 1 binding, adenylate cyclase binding,
eukaryotic initiation factor 4E binding, calmodulin binding, collagen binding,
insulin-like growth factor binding, lamin binding, profilin binding, tropomyosin
binding, actin binding, peroxisome targeting sequence binding, SNARE binding
and cyclin binding can be used to diagnose/treat diseases involving non-normal
protein activity or structure. Binding of the products of the variants of this 
family, or antibodies reactive therewith, can modulate a plurality of protein 
activities as well as change protein structure. 
Transcription factor binding proteins 

Information derived from proteins involved in transcription factors 
binding, RNA and DNA binding, such as transcription factors, RNA and DNA 
binding proteins, zinc fingers, helicase, isomerase, histones, and nucleases can 
be used to diagnose/treat diseases involving transcription factors binding 
proteins, for example diseases where there is abnormal replication or 
transcription of DNA and RNA respectively. 

Enzyme regulators 

Information derived from enzyme regulators, such as activators of 
kinases, phosphatases, sphingolipids, chaperones, guanylate cyclase, 
tryptophan hydroxylase, proteases, phospholipases, caspases, proprotein 
convertase 2 activator, cyclin-dependent protein kinase 5 activator, superoxide- 
generating NADPH oxidase activator, sphingomyelin phosphodiesterase 
activator, monophenol monooxygenase activator, proteasome activator, and 
GTPase activator can be used to diagnose/treat diseases in which beneficial 
effect may be achieved by modulating the activity of activators of proteins and 
enzymes. 

Cell growth and/or maintenance proteins 

Information derived from proteins involved in any biological process 
required for cell survival, growth and maintenance including proteins involved 
in cell organization and biogenesis, cell growth, cell proliferation, metabolism,
cell cycle, budding, cell shape and cell size control, sporulation (sensu 
Saccharomyces), transport, ion homeostasis, autophagy, cell motility, chemi- 
mechanical coupling, membrane fusion, cell-cell fusion and stress response can 
be used to diagnose/treat diseases involving premature death of cells, such as 
degenerative diseases, for example neurodegenerative diseases or conditions 
associated with aging, or alternatively, diseases in which cell apoptosis is not
turned on, such as cancerous diseases. 
Metabolic proteins 

Information derived from proteins involved in carbohydrate metabolism, 
energy pathways, electron transport, nucleobase, nucleoside, nucleotide and 
nucleic acid metabolism, protein metabolism and modification, amino acid and 
derivative metabolism, protein targeting, lipid metabolism, aromatic compound 
metabolism, one-carbon compound metabolism, coenzymes and prosthetic 
group metabolism, sulfur metabolism, phosphorus metabolism, phosphate 
metabolism, oxygen and radical metabolism, xenobiotic metabolism, nitrogen 
metabolism, fat body metabolism (sensu Insecta), protein localization, 
catabolism, biosynthesis, toxin metabolism, methylglyoxal metabolism,
cyanate metabolism, glycolate metabolism, carbon utilization, and antibiotic 
metabolism can be used to treat or detect diseases in which metabolism of small 
molecules and macromolecules such as toxins, lipids, proteins and 
carbohydrates is abnormal leading to various pathologies. 

Channel/pore class transporters 

Information derived from proteins that mediate the transport of 
molecules and macromolecules across membranes, such as alpha-type channels,
porins and pore-forming toxins can be used to diagnose/treat diseases in which
the transport of molecules and macromolecules such as neurotransmitters, 
hormones, sugar etc. is non-normal leading to various pathologies. 

Tubulin binding proteins 

Information derived from proteins that bind tubulin, such as microtubule 
binding proteins can be used to diagnose/treat diseases involving abnormal 
tubulin activity or structure. Binding of the RNA products of the genes of this 
family, or antibodies reactive therewith, can modulate a plurality of tubulin 
activities as well as change microtubule structure.
Kinases

Information derived from kinases such as 2-amino-4-hydroxy-6-
hydroxymethyldihydropteridine pyrophosphokinase, NAD(+) kinase,
acetylglutamate kinase, adenosine kinase, adenylate kinase, adenylsulfate 
kinase, arginine kinase, aspartate kinase, choline kinase, creatine kinase, 
cytidylate kinase, deoxyadenosine kinase, deoxycytidine kinase, 
deoxyguanosine kinase, dephospho-CoA kinase, diacylglycerol kinase, dolichol 
kinase, ethanolamine kinase, galactokinase, glucokinase, glutamate 5-kinase, 
glycerol kinase, glycerone kinase, guanylate kinase, hexokinase, homoserine 
kinase, hydroxyethylthiazole kinase, inositol/phosphatidylinositol kinase, 
ketohexokinase, mevalonate kinase, nucleoside-diphosphate kinase, 
pantothenate kinase, phosphoenolpyruvate carboxykinase, phosphoglycerate 
kinase, phosphomevalonate kinase, protein kinase, pyruvate dehydrogenase 
(lipoamide) kinase, pyruvate kinase, ribokinase, ribose-phosphate 
pyrophosphokinase, selenide, water dikinase, shikimate kinase, thiamine
pyrophosphokinase, thymidine kinase, thymidylate kinase, uridine kinase, 
xylulokinase, 1D-myo-inositol-trisphosphate 3-kinase, phosphofructokinase,
pyridoxal kinase, sphinganine kinase, riboflavin kinase, 2-dehydro-3-
deoxygalactonokinase, 2-dehydro-3-deoxygluconokinase, 4-diphosphocytidyl-
2C-methyl-D-erythritol kinase, GTP pyrophosphokinase, L-fuculokinase, L-
ribulokinase, L-xylulokinase, [isocitrate dehydrogenase (NADP+)] kinase,
acetate kinase, allose kinase, carbamate kinase, cobinamide kinase, 
diphosphate-purine nucleoside kinase, fructokinase, glycerate kinase, 
hydroxymethylpyrimidine kinase, hygromycin-B kinase, inosine kinase, 
kanamycin kinase, phosphomethylpyrimidine kinase, phosphoribulokinase, 
polyphosphate kinase, propionate kinase, pyruvate, water dikinase,
rhamnulokinase, tagatose-6-phosphate kinase, tetraacyldisaccharide 4'-kinase,
thiamine-phosphate kinase, undecaprenol kinase, uridylate kinase, N- 
acylmannosamine kinase and D-erythro-sphingosine kinase can be used to 
diagnose/treat diseases which may be ameliorated by modulating kinase
activity, which is one of the main signaling pathways inside cells.
Oxidoreductases 

Information derived from enzymes that catalyze an oxidation-reduction 
reaction, including oxidoreductases acting on CH-OH, CH-CH, CH-NH2, CH- 
NH, NADH or NADPH, nitrogenous compounds, sulfur group of donors, heme 
group, hydrogen group, diphenols and related substances as donors, 
oxidoreductases acting on peroxide as acceptor, superoxide radicals as 
acceptor, oxidizing metal ions, CH2 groups, reduced ferredoxin donor, reduced 
flavodoxin donor, and aldehyde or oxo group of donors can be used to 
diagnose/treat diseases involving non-normal activity of oxidoreductases. 

Transferases 

Information derived from enzymes that catalyze the transfer of a 
chemical group, such as a phosphate or amine, from one molecule to another 
including transferases transferring one-carbon groups, aldehyde or ketonic
groups, acyl groups, glycosyl groups, alkyl or aryl (other than methyl) groups,
nitrogenous groups, phosphorus-containing groups and sulfur-containing groups, as
well as lipoyltransferases and deoxycytidyl transferases, can be used to diagnose/treat
diseases in which the transfer of a chemical group from one molecule to 
another is abnormal and a beneficial effect may be achieved by modulation of 
such abnormal reactions. 

Transferases - one-carbon group 

Information derived from enzymes that catalyze the transfer of a single 
carbon from one molecule to another including methyltransferase, 
amidinotransferase, hydroxymethyl-, formyl- and related transferase, carboxyl- 
and carbamoyltransferase can be used to diagnose/treat diseases in which the 
transfer of a one-carbon chemical group from one molecule to another is 
abnormal and a beneficial effect may be achieved by modulation of such an 
abnormal reaction. 
Transferases - glycosyl groups

Information derived from enzymes that catalyze the transfer of a 
glycosyl from one molecule to another including murein lytic 
endotransglycosylase E and sialyltransferase can be used to diagnose/treat 
diseases in which the transfer of a glycosyl chemical group from one molecule
to another is abnormal and a beneficial effect may be achieved by modulation 
of such an abnormal reaction. 

Transferases - phosphorus-containing groups 

Information derived from enzymes that catalyze the transfer of 
phosphate from one molecule to another can be used to diagnose/treat diseases
in which the transfer of a phosphate group to a modulated moiety is abnormal 
and a beneficial effect may be achieved by modulation of such abnormal 
transfer. 

Hydrolases 

Information derived from hydrolytic enzymes acting on ester bonds,
glycosyl bonds, ether bonds, carbon-nitrogen (but not peptide) bonds, acid
anhydrides, acid carbon-carbon bonds, acid halide bonds, acid phosphorus-
nitrogen bonds, acid sulfur-nitrogen bonds, acid carbon-phosphorus bonds and
acid sulfur-sulfur bonds can be used to diagnose/treat diseases in which the
hydrolytic cleavage of a covalent bond with accompanying addition of water,
-H being added to one product of the cleavage and -OH to the other, is abnormal
and a beneficial effect may be achieved by modulation of such an abnormal
reaction.

Hydrolases, acting on ester bonds 

Information derived from hydrolytic enzymes, acting on ester bonds,
such as nucleases, sulfuric ester hydrolase, carboxylic ester hydrolases,
thiolester hydrolase, phosphoric monoester hydrolase, phosphoric diester
hydrolase, triphosphoric monoester hydrolase, diphosphoric monoester
hydrolase and phosphoric triester hydrolase can be used to diagnose/treat
diseases in which the hydrolytic cleavage of a covalent bond with
accompanying addition of water, -H being added to one product of the cleavage
and -OH to the other, is abnormal and a beneficial effect may be achieved by
modulation of such an abnormal reaction.
Carboxylic ester hydrolases 
Information derived from hydrolytic enzymes, acting on carboxylic ester
bonds, such as N-acetylglucosaminylphosphatidylinositol deacetylase, 2-acetyl-
1-alkylglycerophosphocholine esterase, aminoacyl-tRNA hydrolase,
arylesterase, carboxylesterase, cholinesterase, gluconolactonase, sterol
esterase, acetylesterase, carboxymethylenebutenolidase, protein-glutamate
methylesterase, lipase and 6-phosphogluconolactonase can be used to
diagnose/treat diseases in which the hydrolytic cleavage of a covalent bond with
accompanying addition of water, -H being added to one product of the cleavage
and -OH to the other, is abnormal and a beneficial effect may be achieved by
modulation of such an abnormal reaction.
Phosphoric monoester hydrolases

Information derived from hydrolytic enzymes acting on ester bonds, such 
as nuclease, sulfuric ester hydrolase, carboxylic ester hydrolase, thiolester 
hydrolase, phosphoric monoester hydrolase, phosphoric diester hydrolase, 
triphosphoric monoester hydrolase, diphosphoric monoester hydrolase and 
phosphoric triester hydrolase can be used to diagnose/treat diseases in which
the hydrolytic cleavage of a covalent bond with accompanying addition of 
water, -H being added to one product of the cleavage and -OH to the other, is 
abnormal and a beneficial effect may be achieved by modulation of such an 
abnormal reaction. 
Hydrolases acting on glycosyl bonds

Information derived from hydrolytic enzymes that act on glycosyl bonds, 
such as hydrolases hydrolyzing N-glycosyl compounds, S-glycosyl
compounds and O-glycosyl compounds can be used to diagnose/treat diseases in
which the hydrolase-related activities are abnormal. 

Hydrolases acting on acid anhydrides

Information derived from hydrolytic enzymes which act on acid 
anhydrides, such as phosphorus-containing anhydrides, sulfonyl-containing 
anhydrides, and hydrolases catalysing transmembrane movement of substances, 
and involved in cellular and subcellular movement can be used to
diagnose/treat diseases in which the hydrolase-related activities are abnormal. 

Lyases 

Information derived from enzymes that catalyze the formation of double 
bonds by removing chemical groups from a substrate without hydrolysis or 
catalyze the addition of chemical groups to double bonds including carbon-
carbon lyases, carbon-oxygen lyases, carbon-nitrogen lyases, carbon-sulfur
lyases, carbon-halide lyases, phosphorus-oxygen lyases, and other lyases can 
be used to diagnose/treat diseases in which lyase activity, expression or 
localization is abnormal. 
Ligases

Information derived from enzymes that catalyze the linkage of two 
molecules, generally utilizing ATP as the energy donor can be used to 
diagnose/treat diseases in which the joining together of two molecules in an 
energy-dependent process is abnormal and a beneficial effect may be achieved 
by modulation of such an abnormal reaction.

Ligases catalyzing carbon-oxygen bonds 

Information derived from enzymes that catalyze the linkage between 
carbon and oxygen, such as ligase forming aminoacyl-tRNA and related 
compounds can be used to diagnose/treat diseases in which the linkage between 
carbon and oxygen in an energy-dependent process is abnormal and a beneficial
effect may be achieved by modulation of such an abnormal reaction. 
ATPases

Information derived from enzymes such as plasma membrane cation-
transporting ATPase, ATP-binding cassette (ABC) transporter, magnesium-
ATPase, hydrogen-/sodium-translocating ATPase, arsenite-transporting
ATPase, protein-transporting ATPase, DNA translocase, and P-type ATPase
can be used to diagnose/treat diseases associated with abnormal activity of an
ATP hydrolyzing enzyme.
Diseases

Various types of diseases can be diagnosed/treated using the teachings of 
the present invention. 

Inflammatory diseases 

Examples of inflammatory diseases include, but are not limited to,
chronic inflammatory diseases and acute inflammatory diseases. 

Inflammatory diseases associated with hypersensitivity 

Examples of hypersensitivity include, but are not limited to, Types I-IV 
hypersensitivity, immediate hypersensitivity, antibody mediated 
hypersensitivity, immune complex mediated hypersensitivity, T lymphocyte 
mediated hypersensitivity and DTH. 

An example of type I or immediate hypersensitivity is asthma. Examples
of type II hypersensitivity include, but are not limited to, rheumatoid diseases,
rheumatoid autoimmune diseases, rheumatoid arthritis (Krenn V. et al, Histol
Histopathol 2000 Jul;15 (3):791), spondylitis, ankylosing spondylitis (Jan
Voswinkel et al, Arthritis Res 2001;3 (3):189), systemic diseases, systemic
autoimmune diseases, systemic lupus erythematosus (Erikson J. et al, Immunol
Res 1998;17 (1-2):49), sclerosis, systemic sclerosis (Renaudineau Y. et al, Clin
Diagn Lab Immunol. 1999 Mar;6 (2):156; Chan OT. et al, Immunol Rev 1999
Jun;169:107), glandular diseases, glandular autoimmune diseases, pancreatic
autoimmune diseases, diabetes, Type I diabetes (Zimmet P. Diabetes Res Clin
Pract 1996 Oct;34 Suppl:S125), thyroid diseases, autoimmune thyroid diseases,
Graves' disease (Orgiazzi J. Endocrinol Metab Clin North Am 2000 Jun;29
(2):339), thyroiditis, spontaneous autoimmune thyroiditis (Braley-Mullen H.
and Yu S, J Immunol 2000 Dec 15;165 (12):7262), Hashimoto's thyroiditis
(Toyoda N. et al, Nippon Rinsho 1999 Aug;57 (8):1810), myxedema,
idiopathic myxedema (Mitsuma T. Nippon Rinsho. 1999 Aug;57 (8):1759);
autoimmune reproductive diseases, ovarian diseases, ovarian autoimmunity
(Garza KM. et al, J Reprod Immunol 1998 Feb;37 (2):87), autoimmune anti-
sperm infertility (Diekman AB. et al, Am J Reprod Immunol. 2000 Mar;43
(3):134), repeated fetal loss (Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9),
neurodegenerative diseases, neurological diseases, neurological autoimmune
diseases, multiple sclerosis (Cross AH. et al, J Neuroimmunol 2001 Jan 1;112
(1-2):1), Alzheimer's disease (Oron L. et al, J Neural Transm Suppl.
1997;49:77), myasthenia gravis (Infante AJ. and Kraig E, Int Rev Immunol
1999;18 (1-2):83), motor neuropathies (Kornberg AJ. J Clin Neurosci. 2000
May;7 (3):191), Guillain-Barre syndrome, neuropathies and autoimmune
neuropathies (Kusunoki S. Am J Med Sci. 2000 Apr;319 (4):234), myasthenic
diseases, Lambert-Eaton myasthenic syndrome (Takamori M. Am J Med Sci.
2000 Apr;319 (4):204), paraneoplastic neurological diseases, cerebellar
atrophy, paraneoplastic cerebellar atrophy, non-paraneoplastic stiff man
syndrome, cerebellar atrophies, progressive cerebellar atrophies, encephalitis,
Rasmussen's encephalitis, amyotrophic lateral sclerosis, Sydenham chorea,
Gilles de la Tourette syndrome, polyendocrinopathies, autoimmune
polyendocrinopathies (Antoine JC. and Honnorat J. Rev Neurol (Paris) 2000
Jan;156 (1):23); neuropathies, dysimmune neuropathies (Nobile-Orazio E. et
al, Electroencephalogr Clin Neurophysiol Suppl 1999;50:419); neuromyotonia,
acquired neuromyotonia, arthrogryposis multiplex congenita (Vincent A. et al,
Ann N Y Acad Sci. 1998 May 13;841:482), cardiovascular diseases,
cardiovascular autoimmune diseases, atherosclerosis (Matsuura E. et al, Lupus.
1998;7 Suppl 2:S135), myocardial infarction (Vaarala O. Lupus. 1998;7 Suppl
2:S132), thrombosis (Tincani A. et al, Lupus 1998;7 Suppl 2:S107-9),
granulomatosis, Wegener's granulomatosis, arteritis, Takayasu's arteritis and
Kawasaki syndrome (Praprotnik S. et al, Wien Klin Wochenschr 2000 Aug
25;112 (15-16):660); anti-factor VIII autoimmune disease (Lacroix-Desmazes
S. et al, Semin Thromb Hemost. 2000;26 (2):157); vasculitises, necrotizing
small vessel vasculitises, microscopic polyangiitis, Churg and Strauss
syndrome, glomerulonephritis, pauci-immune focal necrotizing
glomerulonephritis, crescentic glomerulonephritis (Noel LH. Ann Med Interne
(Paris). 2000 May;151 (3):178); antiphospholipid syndrome (Flamholz R. et al,
J Clin Apheresis 1999;14 (4):171); heart failure, agonist-like beta-adrenoceptor
antibodies in heart failure (Wallukat G. et al, Am J Cardiol. 1999 Jun 17;83
(12A):75H), thrombocytopenic purpura (Moccia F. Ann Ital Med Int. 1999
Apr-Jun;14 (2):114); hemolytic anemia, autoimmune hemolytic anemia
(Efremov DG. et al, Leuk Lymphoma 1998 Jan;28 (3-4):285), gastrointestinal
diseases, autoimmune diseases of the gastrointestinal tract, intestinal diseases,
chronic inflammatory intestinal disease (Garcia Herola A. et al, Gastroenterol
Hepatol. 2000 Jan;23 (1):16), celiac disease (Landau YE. and Shoenfeld Y.
Harefuah 2000 Jan 16;138 (2):122), autoimmune diseases of the musculature,
myositis, autoimmune myositis, Sjogren's syndrome (Feist E. et al, Int Arch
Allergy Immunol 2000 Sep;123 (1):92); smooth muscle autoimmune disease
(Zauli D. et al, Biomed Pharmacother 1999 Jun;53 (5-6):234), hepatic diseases,
hepatic autoimmune diseases, autoimmune hepatitis (Manns MP. J Hepatol
2000 Aug;33 (2):326) and primary biliary cirrhosis (Strassburg CP. et al, Eur J
Gastroenterol Hepatol. 1999 Jun;11 (6):595).

Examples of type IV or T cell mediated hypersensitivity include, but are
not limited to, rheumatoid diseases, rheumatoid arthritis (Tisch R, McDevitt
HO. Proc Natl Acad Sci USA 1994 Jan 18;91 (2):437), systemic diseases,
systemic autoimmune diseases, systemic lupus erythematosus (Datta SK.,
Lupus 1998;7 (9):591), glandular diseases, glandular autoimmune diseases,
pancreatic diseases, pancreatic autoimmune diseases, Type 1 diabetes (Castano
L. and Eisenbarth GS. Ann. Rev. Immunol. 8:647); thyroid diseases,
autoimmune thyroid diseases, Graves' disease (Sakata S. et al, Mol Cell
Endocrinol 1993 Mar;92 (1):77); ovarian diseases (Garza KM. et al, J Reprod
Immunol 1998 Feb;37 (2):87), prostatitis, autoimmune prostatitis (Alexander
RB. et al, Urology 1997 Dec;50 (6):893), polyglandular syndrome,
autoimmune polyglandular syndrome, Type I autoimmune polyglandular
syndrome (Hara T. et al, Blood. 1991 Mar 1;77 (5):1127), neurological
diseases, autoimmune neurological diseases, multiple sclerosis, neuritis, optic
neuritis (Soderstrom M. et al, J Neurol Neurosurg Psychiatry 1994 May;57
(5):544), myasthenia gravis (Oshima M. et al, Eur J Immunol 1990 Dec;20
(12):2563), stiff-man syndrome (Hiemstra HS. et al, Proc Natl Acad Sci U S A
2001 Mar 27;98 (7):3988), cardiovascular diseases, cardiac autoimmunity in
Chagas' disease (Cunha-Neto E. et al, J Clin Invest 1996 Oct 15;98 (8):1709),
autoimmune thrombocytopenic purpura (Semple JW. et al, Blood 1996 May
15;87 (10):4245), anti-helper T lymphocyte autoimmunity (Caporossi AP. et
al, Viral Immunol 1998;11 (1):9), hemolytic anemia (Sallah S. et al, Ann
Hematol 1997 Mar;74 (3):139), hepatic diseases, hepatic autoimmune diseases,
hepatitis, chronic active hepatitis (Franco A. et al, Clin Immunol
Immunopathol 1990 Mar;54 (3):382), biliary cirrhosis, primary biliary cirrhosis
(Jones DE. Clin Sci (Colch) 1996 Nov;91 (5):551), nephric diseases, nephric
autoimmune diseases, nephritis, interstitial nephritis (Kelly CJ. J Am Soc
Nephrol 1990 Aug;1 (2):140), connective tissue diseases, ear diseases,
autoimmune connective tissue diseases, autoimmune ear disease (Yoo TJ. et al,
Cell Immunol 1994 Aug;157 (1):249), disease of the inner ear (Gloddek B. et
al, Ann N Y Acad Sci 1997 Dec 29;830:266), skin diseases, cutaneous
diseases, dermal diseases, bullous skin diseases, pemphigus vulgaris, bullous
pemphigoid and pemphigus foliaceus.

Examples of delayed type hypersensitivity include, but are not limited to, 
contact dermatitis and drug eruption. 

Autoimmune diseases 

Examples of autoimmune diseases include, but are not limited to, 
cardiovascular diseases, rheumatoid diseases, glandular diseases, 
gastrointestinal diseases, cutaneous diseases, hepatic diseases, neurological 
diseases, muscular diseases, nephric diseases, diseases related to reproduction, 
connective tissue diseases and systemic diseases. 

Examples of autoimmune cardiovascular diseases include, but are not 
limited to, atherosclerosis (Matsuura E. et al., Lupus. 1998;7 Suppl 2:S135), 
myocardial infarction (Vaarala O. Lupus. 1998;7 Suppl 2:S132), thrombosis 
(Tincani A. et al., Lupus 1998;7 Suppl 2:S107-9), Wegener's granulomatosis, 
Takayasu's arteritis, Kawasaki syndrome (Praprotnik S. et al., Wien Klin 
Wochenschr 2000 Aug 25;112 (15-16):660), anti-factor VIII autoimmune 
disease (Lacroix-Desmazes S. et al., Semin Thromb Hemost 2000;26 (2):157), 
necrotizing small vessel vasculitis, microscopic polyangiitis, Churg and Strauss 
syndrome, pauci-immune focal necrotizing and crescentic glomerulonephritis 
(Noel LH. Ann Med Interne (Paris). 2000 May;151 (3):178), antiphospholipid 
syndrome (Flamholz R. et al., J Clin Apheresis 1999;14 (4):171), 
antibody-induced heart failure (Wallukat G. et al., Am J Cardiol. 1999 Jun 
17;83 (12A):75H), thrombocytopenic purpura (Moccia R Ann Ital Med Int 1999 
Apr-Jun;14 (2):114; Semple JW. et al., Blood 1996 May 15;87 (10):4245), 
autoimmune hemolytic anemia (Efremov DG. et al., Leuk Lymphoma 1998 
Jan;28 (3-4):285; Sallah S. et al., Ann Hematol 1997 Mar;74 (3):139), cardiac 
autoimmunity in Chagas' disease (Cunha-Neto E. et al., J Clin Invest 1996 Oct 
15;98 (8):1709) and anti-helper T lymphocyte autoimmunity (Caporossi AP. et 
al., Viral Immunol 1998;11 (1):9). 

Examples of autoimmune rheumatoid diseases include, but are not 
limited to, rheumatoid arthritis (Krenn V. et al., Histol Histopathol 2000 Jul;15 
(3):791; Tisch R, McDevitt HO. Proc Natl Acad Sci U S A 1994 Jan 18;91 
(2):437) and ankylosing spondylitis (Jan Voswinkel et al., Arthritis Res 2001;3 
(3):189). 

Examples of autoimmune glandular diseases include, but are not limited 
to, pancreatic disease, Type I diabetes, thyroid disease, Graves' disease, 
thyroiditis, spontaneous autoimmune thyroiditis, Hashimoto's thyroiditis, 
idiopathic myxedema, ovarian autoimmunity, autoimmune anti-sperm 
infertility, autoimmune prostatitis and Type I autoimmune polyglandular 
syndrome. Such diseases include, but are not limited to, autoimmune diseases 
of the pancreas, Type 1 diabetes (Castano L. and Eisenbarth GS. Ann. Rev. 
Immunol. 8:647; Zimmet P. Diabetes Res Clin Pract 1996 Oct;34 Suppl:S125), 
autoimmune thyroid diseases, Graves' disease (Orgiazzi J. Endocrinol Metab 
Clin North Am 2000 Jun;29 (2):339; Sakata S. et al., Mol Cell Endocrinol 1993 
Mar;92 (1):77), spontaneous autoimmune thyroiditis (Braley-Mullen H. and Yu 
S, J Immunol 2000 Dec 15;165 (12):7262), Hashimoto's thyroiditis (Toyoda N. 
et al., Nippon Rinsho 1999 Aug;57 (8):1810), idiopathic myxedema (Mitsuma 
T. Nippon Rinsho. 1999 Aug;57 (8):1759), ovarian autoimmunity (Garza KM. 
et al., J Reprod Immunol 1998 Feb;37 (2):87), autoimmune anti-sperm 
infertility (Diekman AB. et al., Am J Reprod Immunol. 2000 Mar;43 (3):134), 
autoimmune prostatitis (Alexander RB. et al., Urology 1997 Dec;50 (6):893) 
and Type I autoimmune polyglandular syndrome (Hara T. et al., Blood. 1991 
Mar 1;77 (5):1127). 

Examples of autoimmune gastrointestinal diseases include, but are not 
limited to, chronic inflammatory intestinal diseases (Garcia Herola A. et al., 
Gastroenterol Hepatol. 2000 Jan;23 (1):16), celiac disease (Landau YE. and 
Shoenfeld Y. Harefuah 2000 Jan 16;138 (2):122), colitis, ileitis and Crohn's 
disease. 

Examples of autoimmune cutaneous diseases include, but are not limited 
to, autoimmune bullous skin diseases, such as, but not limited to, pemphigus 
vulgaris, bullous pemphigoid and pemphigus foliaceus. 

Examples of autoimmune hepatic diseases include, but are not limited to, 
hepatitis, autoimmune chronic active hepatitis (Franco A. et al., Clin Immunol 
Immunopathol 1990 Mar;54 (3):382), primary biliary cirrhosis (Jones DE. Clin 
Sci (Colch) 1996 Nov;91 (5):551; Strassburg CP. et al., Eur J Gastroenterol 
Hepatol. 1999 Jun;11 (6):595) and autoimmune hepatitis (Manns MP. J Hepatol 
2000 Aug;33 (2):326). 

Examples of autoimmune neurological diseases include, but are not 
limited to, multiple sclerosis (Cross AH. et al., J Neuroimmunol 2001 Jan 1;112 
(1-2):1), Alzheimer's disease (Oron L. et al., J Neural Transm Suppl. 
1997;49:77), myasthenia gravis (Infante AJ. and Kraig E, Int Rev Immunol 
1999;18 (1-2):83; Oshima M. et al., Eur J Immunol 1990 Dec;20 (12):2563), 
neuropathies, motor neuropathies (Kornberg AJ. J Clin Neurosci. 2000 May;7 
(3):191); Guillain-Barre syndrome and autoimmune neuropathies (Kusunoki S. 
Am J Med Sci. 2000 Apr;319 (4):234), myasthenia, Lambert-Eaton myasthenic 
syndrome (Takamori M. Am J Med Sci. 2000 Apr;319 (4):204); paraneoplastic 
neurological diseases, cerebellar atrophy, paraneoplastic cerebellar atrophy and 
stiff-man syndrome (Hiemstra HS. et al., Proc Natl Acad Sci U S A 2001 
Mar 27;98 (7):3988); non-paraneoplastic stiff man syndrome, progressive 
cerebellar atrophies, encephalitis, Rasmussen's encephalitis, amyotrophic 
lateral sclerosis, Sydenham chorea, Gilles de la Tourette syndrome and 
autoimmune polyendocrinopathies (Antoine JC. and Honnorat J. Rev Neurol 
(Paris) 2000 Jan;156 (1):23); dysimmune neuropathies (Nobile-Orazio E. et al., 
Electroencephalogr Clin Neurophysiol Suppl 1999;50:419); acquired 
neuromyotonia, arthrogryposis multiplex congenita (Vincent A. et al., Ann N Y 
Acad Sci. 1998 May 13;841:482), neuritis, optic neuritis (Soderstrom M. et al., 
J Neurol Neurosurg Psychiatry 1994 May;57 (5):544) and neurodegenerative 
diseases. 

Examples of autoimmune muscular diseases include, but are not limited 
to, myositis, autoimmune myositis and primary Sjogren's syndrome (Feist E. et 
al., Int Arch Allergy Immunol 2000 Sep;123 (1):92) and smooth muscle 
autoimmune disease (Zauli D. et al., Biomed Pharmacother 1999 Jun;53 
(5-6):234). 

Examples of autoimmune nephric diseases include, but are not limited 
to, nephritis and autoimmune interstitial nephritis (Kelly CJ. J Am Soc Nephrol 
1990 Aug;1 (2):140). 

Examples of autoimmune diseases related to reproduction include, but 
are not limited to, repeated fetal loss (Tincani A. et al., Lupus 1998;7 Suppl 
2:S107-9). 

Examples of autoimmune connective tissue diseases include, but are not 
limited to, ear diseases, autoimmune ear diseases (Yoo TJ. et al., Cell Immunol 
1994 Aug;157 (1):249) and autoimmune diseases of the inner ear (Gloddek B. 
et al., Ann N Y Acad Sci 1997 Dec 29;830:266). 

Examples of autoimmune systemic diseases include, but are not limited 
to, systemic lupus erythematosus (Erikson J. et al., Immunol Res 1998;17 
(1-2):49) and systemic sclerosis (Renaudineau Y. et al., Clin Diagn Lab Immunol. 
1999 Mar;6 (2):156; Chan OT. et al., Immunol Rev 1999 Jun;169:107). 

Infectious diseases 

Examples of infectious diseases include, but are not limited to, chronic 
infectious diseases, subacute infectious diseases, acute infectious diseases, viral 
diseases, bacterial diseases, protozoan diseases, parasitic diseases, fungal 
diseases, mycoplasma diseases and prion diseases. 

Graft rejection diseases 

Examples of diseases associated with transplantation of a graft include, 
but are not limited to, graft rejection, chronic graft rejection, subacute graft 
rejection, hyperacute graft rejection, acute graft rejection and graft versus host 
disease. 

Allergic diseases 

Examples of allergic diseases include, but are not limited to, asthma, 
hives, urticaria, pollen allergy, dust mite allergy, venom allergy, cosmetics 
allergy, latex allergy, chemical allergy, drug allergy, insect bite allergy, animal 
dander allergy, stinging plant allergy, poison ivy allergy and food allergy. 

Cancerous diseases 

Examples of cancer include, but are not limited to, carcinoma, lymphoma, 
blastoma, sarcoma, and leukemia. Particular examples of cancerous diseases 
include, but are not limited to: Myeloid leukemia, such as Chronic myelogenous 
leukemia, Acute myelogenous leukemia with maturation, Acute promyelocytic 
leukemia, Acute nonlymphocytic leukemia with increased basophils, Acute 
monocytic leukemia, Acute myelomonocytic leukemia with eosinophilia; 
Malignant lymphoma, such as Burkitt's, Non-Hodgkin's; Lymphocytic leukemia, 
such as Acute lymphoblastic leukemia, Chronic lymphocytic leukemia; 
Myeloproliferative diseases, such as Solid tumors, Benign Meningioma, Mixed 
tumors of salivary gland, Colonic adenomas; Adenocarcinomas, such as Small 
cell lung cancer, Kidney, Uterus, Prostate, Bladder, Ovary, Colon, Sarcomas, 
Liposarcoma, myxoid, Synovial sarcoma, Rhabdomyosarcoma (alveolar), 
Extraskeletal myxoid chondrosarcoma, Ewing's tumor; others include 
Testicular and ovarian dysgerminoma, Retinoblastoma, Wilms' tumor, 
Neuroblastoma, Malignant melanoma, Mesothelioma, breast, skin, prostate, and 
ovarian. 



EXAMPLE 9 

Microarray analysis based validation of the antisense dataset 

A microarray-based analysis using oligonucleotide probes that hybridize 
to the target in a strand-specific manner was conducted in order to 
experimentally validate the predicted antisense/sense pairs of the database. Two 
complementary 60-mer oligonucleotide probes, derived from the predicted 
overlap region of each sense/antisense pair, were designed. Single 60-mer 
oligonucleotides were previously shown to offer reliability and sensitivity for 
detecting specific transcripts (T. R. Hughes, et al., Nature Biotech. 19, 342 
(2001)). Initially, only pairs of clusters with an overlap greater than 60 bases 
(2,464 pairs met this restriction) were selected for array construction. 
The overlap region of each antisense pair was then checked for the presence of 
60-mer oligonucleotides that met a set of standards, such as minimal 
sequence similarity elsewhere in the human genome, uniform GC-content and 
Tm, and absence of palindromic sequences, in order to maximize the 
hybridization specificity. Oligonucleotide probes meeting these criteria 
were identified for 1,211 sense/antisense pairs, and a random sample of 264 
pairs, which constitutes roughly one-tenth of the original dataset of 2667 
sense/antisense cluster pairs, was selected for analysis by Microarrays (Table 
S1 on CD-ROM3, an excerpt of which is shown in Table 5 below). In this 
sample, the proportion of each of the nine subgroups depicted in Table 4 is 
similar to that of the original dataset, indicating a good representation of the 
various subgroups. 
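
The probe selection step described above can be illustrated with a minimal sketch. It is not the procedure actually used for the array: the function names, the 40-60% GC window and the 8-base palindrome window are assumptions chosen for illustration, and the genome-wide similarity and Tm checks mentioned in the text are omitted because the text does not specify them. The sketch scans an overlap region for candidate 60-mers, filters them on GC content and self-complementary stretches, and returns each candidate together with its reverse complement, which corresponds to the complementary probe of the pair.

```python
from typing import Iterator, Tuple

# Translation table for complementing upper-case DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_palindrome(seq: str, size: int = 8) -> bool:
    """True if any window of `size` bases equals its own reverse complement."""
    return any(
        seq[i:i + size] == reverse_complement(seq[i:i + size])
        for i in range(len(seq) - size + 1)
    )


def candidate_probe_pairs(
    overlap: str,
    probe_len: int = 60,        # probe length stated in the text
    gc_min: float = 0.40,       # assumed GC window, not from the text
    gc_max: float = 0.60,
    palindrome_size: int = 8,   # assumed palindrome window, not from the text
) -> Iterator[Tuple[int, str, str]]:
    """Yield (start, probe, complementary probe) for windows passing the filters."""
    overlap = overlap.upper()
    for start in range(len(overlap) - probe_len + 1):
        probe = overlap[start:start + probe_len]
        if not gc_min <= gc_fraction(probe) <= gc_max:
            continue
        if has_palindrome(probe, palindrome_size):
            continue
        yield start, probe, reverse_complement(probe)
```

An overlap region shorter than the probe length yields no candidates, which is consistent with the restriction to cluster pairs overlapping by more than 60 bases.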
Table 4 

mRNA / Splicing      No cluster    1 cluster      2 clusters     Total 
                     w introns     w intron(s)    w intron(s) 
No cluster w mRNA    48            132            197            377 (14%) 
1 cluster w mRNA     17            490            1039           1546 (58%) 
2 clusters w mRNA    1             85             658            744 (28%) 
Total                66 (2.5%)     707 (26%)      1894 (71%)     2667 (100%) 

Table represents the proportion of sense/antisense clusters in the dataset of 2667 that 
contain: 1) a known mRNA and 2) expressed sequences spanning at least one intron, in 
one of the two clusters, in both clusters or in none of the clusters. 



Table 5 below is an excerpt of Table S1 provided on CD-ROM3; Table 
5 exemplifies five of the putative sense/antisense pairs that were selected for 
microarray analysis. The first column provides the pair number. The next two 
columns provide the accession numbers of representative expressed sequences 
from the overlapping region of the sense and the antisense genes, respectively. 
The two columns identified by the "RNA" header provide the accession 
numbers of known mRNAs in the sense and antisense clusters (if available), 
and the last two columns provide the GenBank descriptions of these mRNAs. 



Table 5 

Pair no. | Sense seq. from overlapping region | Antisense seq. from overlapping region | RNA in sense cluster | RNA in a-sense cluster | Description of RNA in sense cluster | Description of RNA in antisense cluster 
-------- | ---------------------------------- | -------------------------------------- | -------------------- | ---------------------- | ----------------------------------- | --------------------------------------- 
235 | NM_6227 | NM_308 | NM_6227 | NM_308 | Homo sapiens phospholipid transfer protein (PLTP), mRNA #DV L26232.1 | Homo sapiens protective protein for beta-galactosidase (galactosialidosis) (PPGB), mRNA 
237 | NM_4703 | NM_2532 | NM_4703 | NM_2532 | Homo sapiens rabaptin-5 (RAB5EP), mRNA #DV X91141.1 | Homo sapiens nucleoporin 88kD (NUP88) mRNA #DV Y08612.2 
217 | NM_14885 | AV723808 | NM_14885 | NM_2940 | Homo sapiens anaphase-promoting complex 10 (APC10) mRNA. #DV AL080090.1 | Homo sapiens ATP-binding cassette, sub-family E (OABP), member 1 (ABCE1), mRNA 
209 | BC8865 | BG717574 | NM_32231 | NM_3099 | Homo sapiens hypothetical protein FLJ22875 (FLJ22875), mRNA | Homo sapiens sorting nexin 1 (SNX1), mRNA. #DV U53225.1 
196 | BE885605 | AL527611 | NM_17832 | NM_3640 | Homo sapiens hypothetical protein FLJ20457 (FLJ20457), mRNA | Homo sapiens inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein (IKBKAP), mRNA 



Microarrays were constructed by spotting each of the 264 pairs of 
oligonucleotide probes onto treated glass slides in quadruplicates. The two 
counterpart oligonucleotide probes of each pair were spotted next to each other 
to ensure similar hybridization conditions. 

As positive controls, each of the blocks contained oligonucleotides 
spotted at various concentrations for four ubiquitously expressed housekeeping 
genes: guanine nucleotide binding protein beta polypeptide 2-like 1 (gnb2l1, 
HUMMHBA123, NM_006098), heat shock 70kD protein 10 (hsp70, 
HSHSC70CDS0, NM_006597), beta actin (actin, ACTB, NM_001101), and 
glyceraldehyde-3-phosphate dehydrogenase (gapdh, NM_002046). 

Two random oligonucleotides were used as negative controls. These 
computer-generated arbitrary sequences displayed no alignment to human 
genome sequences but had the same physical characteristics as the other 
oligonucleotide probes. In addition, 22 probes for 11 previously documented 
sense/antisense pairs were also analyzed in the Microarrays (entries Pair no. 
"known 1"-"known 11" on Table S1 of CD-ROM3). 

The Microarrays were hybridized with poly(A)+ RNAs obtained from 19 
human cell lines representing a variety of tissues and four normal human tissues 
(see General Materials and Methods section above). Each poly(A)+ RNA was 
reverse transcribed by priming with oligo(dT) and random nonamers, and 
engineered to incorporate a fluorescent marker. A pool containing an equal mix 
of the RNAs from all cell lines was also transcribed and used as a reference 
target. The resulting fluorescently-labeled cDNAs were combined and 
hybridized to the oligonucleotide Microarrays. 

The experiments were performed in duplicate and utilized a fluorescent 
reversal of the Cy3- and Cy5-labelled cDNA. Stringent hybridization conditions 
were utilized in order to minimize the appearance of false positive signals, 
despite the possibility of compromised detection of low abundance transcripts. 

The raw data was normalized at several levels: within each slide, 
between reciprocal slides, and globally between slides (see General Materials 
and Methods section above). Non-specific levels of hybridization were 
estimated from the negative controls. The threshold for significant positive 
signals resulting from authentic hybridization was set at 4 standard deviations 
of the mean normalized signals for the negative controls. Processed data was 
presented as normalized signal intensity and as normalized signal ratios (Table 
S2 on CD-ROM3). 
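
The thresholding rule can be sketched as follows. This is an illustrative reading, not the analysis code used for the arrays: it assumes that "4 standard deviations of the mean" means the mean negative-control signal plus four sample standard deviations, and the function names and probe labels are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def detection_threshold(negative_controls: Sequence[float], n_sd: float = 4.0) -> float:
    """Mean of the negative-control signals plus n_sd sample standard deviations.

    Interpreting the threshold as mean + 4 SD is an assumption about the text.
    """
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def call_expressed(signals: Dict[str, float], threshold: float) -> Dict[str, bool]:
    """Mark each probe as positive when its normalized signal exceeds the threshold."""
    return {probe: value > threshold for probe, value in signals.items()}
```

For example, negative-control signals of 1.0, 2.0 and 3.0 have a mean of 2.0 and a sample standard deviation of 1.0, so the threshold under this reading is 6.0.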

To further substantiate array results, several pairs of oligonucleotides 
were also utilized in Northern blot analysis. Figures 22a-j illustrate results of 
such Northern blot analysis. Figure 22a reveals expression patterns of randomly 
selected sequence pair number 235, denoted as Rand_235 in Table 6. 
Similarly, Figure 22b corresponds to pair number 173, Figure 22c to pair 
number 248, Figure 22d to pair number 6, Figure 22e to pair number 216, 
Figure 22f to pair number 239, Figure 22g to pair number 202, Figure 22h to 
pair number 114, Figure 22i to pair number 188, and Figure 22j to pair number 
223. The eight pairs evaluated in Figures 22a-h revealed positive signals for both 
sense and antisense expression, while two (Figures 22i-j) revealed a positive 
signal for only one of the genes, with the counterpart being a known RefSeq 
mRNA. 

Figure 23 represents an excerpt of Table S2 (provided in CD-ROM3) 
which summarizes the results obtained utilizing the array generated according 
to the teachings of the present invention. Expression thresholds were verified 
and indicated and normalization for microarray signals was conducted as 
described above. Rji ratios were obtained for each cell line/tissue assessed. 

Taken cumulatively, the data presented herein revealed positive signals 
for both sense and antisense transcripts in 65 cluster pairs. In another 47 cases, 
significant hybridization signals were detected for antisense sequences with 
known counterpart sense transcripts, i.e. RefSeq mRNAs, which did not give 
clear hybridization signals on the Microarrays. Thus, 42.5 % (112 cases) of the 
264 pairs represented on the Microarrays yielded detectable antisense transcription. 

The conversion table, assigning the respective serial number as it 
appears in the "Table" file of CD-ROM1 enclosed herewith, is shown in Table 
6 below. 

Table 6 

Rand_#      Serial No    Rand_#      Serial No    Rand_#      Serial No 
Rand_1      2326         Rand_179    3266         Rand_258    3807 
Rand_10     3647         Rand_18     3073         Rand_259    2621 
Rand_100    2755         Rand_180    1794         Rand_26     4009 
Rand_101    1595         Rand_181    1555         Rand_27     3393 
Rand_102    3686         Rand_182    3554         Rand_28     3589 
Rand_103    2331         Rand_183    3377 

Rand_# = the name of the pair on the chip as it appears in Table S2 on CD-ROM3, column 
"Probe"; Serial No = number of the pair in the Table on CD-ROM1 (could be more than one in case 
the antisense event was separated into more than two contigs). 


The sensitivity of the experimental approach utilized, i.e. the ability to 
detect a given transcript, stems from a combination of the stringency used in the 
microarray analysis and the level of expression and tissue specificity of the 
RNA. This can be estimated from the positive signals obtained for 65% of the 
oligos representing known RefSeq mRNAs on the Microarrays. This level of 
detection is comparable to that obtained in other studies, such as the 58% of 
known exons verified using microarray analysis (D. D. Shoemaker, et al., 
Nature 409, 922; 2001). 

Thus, the present methodology provides a level of detection for a pair of 
genes that is 0.65 x 0.65 = 0.42, a value supported by the detection of positive 
signals for both sense and antisense expression in 5 out of 11 (0.45) clusters of 
previously described sense/antisense pairs (Table S2 on CD-ROM3). 

Of the 264 cluster pairs analyzed in the Microarrays of the present 
invention, 65 clusters (0.25) showed significant signals for both sense and 
antisense transcripts, which is 60% of the proposed level of detection for a pair 
of genes (0.25/0.42). Extrapolating this figure to the predicted antisense dataset 
of 2667 clusters predicts at least 1600 sense/antisense transcriptional units in 
the human genome. 
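
The arithmetic behind this estimate can be retraced with a short sketch. The function names are hypothetical, and the calculation assumes, as the text does, that detection of the sense and antisense members of a pair is independent. With unrounded fractions the result comes out near 1,550, slightly below the figure of about 1,600 that follows from the rounded values used in the text (0.25/0.42, taken as 60%, multiplied by 2667).

```python
def pair_detection_rate(single_rate: float) -> float:
    """Probability of detecting both members of a pair, assuming independence."""
    return single_rate * single_rate


def extrapolate_pairs(
    observed_pairs: int,
    assayed_pairs: int,
    single_rate: float,
    dataset_size: int,
) -> float:
    """Scale the observed double-positive fraction by the pair detection rate."""
    observed_fraction = observed_pairs / assayed_pairs
    corrected_fraction = observed_fraction / pair_detection_rate(single_rate)
    return corrected_fraction * dataset_size


# Values taken from the text: 65 of 264 pairs positive for both strands,
# 65% single-transcript detection, 2667 predicted cluster pairs.
estimate = extrapolate_pairs(65, 264, 0.65, 2667)
print(f"pair detection rate: {pair_detection_rate(0.65):.4f}")
print(f"estimated sense/antisense units: {estimate:.0f}")
```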

Although the invention has been described in conjunction with specific 
embodiments thereof, it is evident that many alternatives, modifications and 
variations will be apparent to those skilled in the art. Accordingly, it is 
intended to embrace all such alternatives, modifications and variations that fall 
within the spirit and broad scope of the appended claims. All publications, 
patents, patent applications and sequences identified by their accession numbers 
mentioned in this specification are herein incorporated in their entirety by 
reference into the specification, to the same extent as if each individual 
publication, patent, patent application or sequence identified by its accession 
number was specifically and individually indicated to be incorporated herein by 
reference. In addition, citation or identification of any reference in this 
application shall not be construed as an admission that such reference is 
available as prior art to the present invention. 

CD-ROM Content 

The following CD-ROMs are attached herewith: 

Information provided as: File name/byte size/date of creation/operating 
system/machine format 

CD-ROM1: 

1. seqs 327MB 15/11/2001 Microsoft Windows Internet Explorer 

2. table 13.5MB 15/11/2001 Microsoft Windows Internet Explorer 

CD-ROM2: 

1. alignments 382MB 15/11/2001 Microsoft Windows Internet Explorer 

CD-ROM3: 

1. Table_S1 79.5kb 10/07/2002 Microsoft Windows Microsoft Excel Worksheet 

2. Table_S2 334kb 10/07/2002 Microsoft Windows Microsoft Excel Worksheet 

1. A method of identifying putative naturally occurring antisense 
transcripts, the method comprising: 

(a) computationally aligning a first database including sense-oriented 
polynucleotide sequences with a second database including 
expressed polynucleotide sequences; and 

(b) identifying expressed polynucleotide sequences from said second 
database being capable of forming a duplex with at least one 
sense-oriented polynucleotide sequence of said first database, 
thereby identifying putative naturally occurring antisense 
transcripts. 

2. The method of claim 1, wherein said first database includes 
sequences of a type selected from the group consisting of genomic sequences, 
expressed sequence tags, contigs, intron sequences, complementary DNA 
(cDNA) sequences, pre-messenger RNA (mRNA) sequences and mRNA 
sequences. 

3. The method of claim 1, wherein said second database includes 
sequences of a type selected from the group consisting of expressed sequence 
tags, contigs, complementary DNA (cDNA) sequences, pre-messenger RNA 
(mRNA) sequences and mRNA sequences. 

4. The method of claim 1, wherein an average sequence length of 
said expressed polynucleotide sequences of said second database is selected 
from a range of 0.02 to 0.8 Kb. 

5. The method of claim 1, wherein said second database is generated by: 

(i) providing a library of expressed polynucleotides; 

(ii) obtaining sequence information of said expressed 
polynucleotides; 

(iii) computationally selecting at least a portion of said expressed 
polynucleotides according to at least one sequence criterion; and 

(iv) storing said sequence information of said at least a portion of said 
expressed polynucleotides thereby generating said second 
database. 



6. The method of claim 5, wherein said at least one sequence 
criterion for computationally selecting said at least a portion of said expressed 
polynucleotide is selected from the group consisting of sequence length, 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 



7. The method of claim 1 further comprising the step of testing the 
putative naturally occurring antisense transcripts for an ability to form said 
duplex with said at least one sense oriented polynucleotide sequence under 
physiological conditions. 



8. The method of claim 1 further comprising the step of 
computationally testing the putative naturally occurring antisense transcripts 
according to at least one criterion selected from the group consisting of 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 

9. A kit for quantifying at least one mRNA transcript of interest, the 
kit comprising at least one oligonucleotide being designed and configured so as 
to be complementary to a sequence region of the mRNA transcript of interest, 
said sequence region not being complementary with a naturally occurring 
antisense transcript. 

10. The kit of claim 9, wherein a length of said at least one 
oligonucleotide is selected from a range of 15-200 nucleotides. 

11. The kit of claim 9, wherein said at least one oligonucleotide is a 
single stranded oligonucleotide. 

12. The kit of claim 9, wherein said at least one oligonucleotide is a 
double stranded oligonucleotide. 

13. The kit of claim 9, wherein a guanidine and cytosine content of 
said at least one oligonucleotide is at least 25 %. 

14. The kit of claim 9, wherein said at least one oligonucleotide is 
labeled. 

15. The kit of claim 9, wherein said at least one oligonucleotide is 
attached to a solid substrate. 

16. The kit of claim 15, wherein said solid substrate is configured as 
a microarray and whereas said at least one oligonucleotide includes a plurality 
of oligonucleotides each attached to said microarray in a regio-specific manner. 

17. A kit for quantifying at least one mRNA transcript of interest, the 
kit comprising at least one pair of oligonucleotides including a first 
oligonucleotide capable of binding the at least one mRNA transcript of interest 
and a second oligonucleotide being capable of binding a naturally occurring 
antisense transcript complementary to the mRNA of interest. 

18. The kit of claim 17, wherein a length of each of said first and 
second oligonucleotides is selected from a range of 15-200 nucleotides. 

19. The kit of claim 17, wherein said first and second 
oligonucleotides are single stranded oligonucleotides. 

20. The kit of claim 17, wherein said first and second 
oligonucleotides are double stranded oligonucleotides. 

21. The kit of claim 17, wherein a guanidine and cytosine content of 
each of said first and second oligonucleotides is at least 25 %. 

22. The kit of claim 17, wherein said first and second 
oligonucleotides are labeled. 

23. The kit of claim 17, wherein said first and second 
oligonucleotides are attached to a solid substrate. 

24. The kit of claim 23, wherein said solid substrate is configured as 
a microarray and whereas each of said first and second oligonucleotides 
includes a plurality of oligonucleotides each attached to said microarray in a 
regio-specific manner. 

25. A kit for quantifying at least one naturally occurring antisense 
transcript of interest, the kit comprising at least one oligonucleotide being 
designed and configured so as to be complementary to a sequence region of the 
at least one naturally occurring antisense transcript of interest, said sequence 
region not being complementary with a naturally occurring mRNA transcript. 

26. The kit of claim 25, wherein a length of said at least one 
oligonucleotide is selected from a range of 15-200 nucleotides. 

27. The kit of claim 25, wherein said at least one oligonucleotide is a 
single stranded oligonucleotide. 

28. The kit of claim 25, wherein said at least one oligonucleotide is a 
double stranded oligonucleotide. 

29. The kit of claim 25, wherein a guanidine and cytosine content of 
said at least one oligonucleotide is at least 25 %. 

30. The kit of claim 25, wherein said at least one oligonucleotide is 
labeled. 

31. The kit of claim 25, wherein said at least one oligonucleotide is 
attached to a solid substrate. 

32. The kit of claim 31, wherein said solid substrate is configured as 
a microarray and whereas said at least one oligonucleotide includes a plurality 
of oligonucleotides each attached to said microarray in a regio-specific manner. 

33. A method of designing artificial antisense transcripts, the method 
comprising: 

(a) providing a database of naturally occurring antisense transcripts; 

(b) extracting from said database criteria governing structure and/or 
function of said naturally occurring antisense transcripts; and 

(c) designing the artificial antisense transcripts according to said 
criteria. 

34. The method of claim 33, wherein said criteria governing structure 
and/or function of said naturally occurring antisense transcripts are selected 
from the group consisting of antisense length, complementarity length, 
complementarity position, intron molecules, alternative splicing sites, tissue 
specificity, pathological abundance, chromosomal mapping, open reading 
frames, promoters, hairpin structures, helix structures, stem and loops, 
pseudoknots and tertiary interactions, guanidine and/or cytosine content, 
guanidine tandems, adenosine content, thermodynamic criteria, RNA duplex 
melting point, RNA modifications, protein-binding motifs, palindromic 
sequence and predicted single stranded and double stranded regions. 

35. The method of claim 33, wherein said step of providing said 
database of naturally occurring antisense transcripts is effected by: 

(a) computationally aligning a first database including sense-oriented 
polynucleotide sequences with a second database including 
expressed polynucleotide sequences; 

(b) identifying expressed polynucleotide sequences from said second 
database being capable of forming a duplex with at least one 
sense-oriented polynucleotide sequence of said first database; and 

(c) storing a sequence of said expressed polynucleotide sequences 
identified in step (b), thereby providing said database of said 
naturally occurring antisense transcripts. 

36. The method of claim 35, wherein said first database includes 
sequences of a type selected from the group consisting of genomic sequences, 
expressed sequence tags, contigs, intron sequences, complementary DNA 
(cDNA) sequences, pre-messenger RNA (mRNA) sequences and mRNA 
sequences. 

37. The method of claim 35, wherein said second database includes 
sequences of a type selected from the group consisting of expressed sequence 
tags, contigs, complementary DNA (cDNA) sequences, pre-messenger RNA 
(mRNA) sequences and mRNA sequences. 

38. The method of claim 35, wherein an average sequence length of 
said expressed polynucleotide sequences of said second database is selected 
from a range of 0.02 to 0.8 Kb. 



39. The method of claim 35, wherein said second database is 
generated by: 

(i) providing a library of expressed polynucleotides; 

(ii) obtaining sequence information of said expressed 
polynucleotides; 

(iii) computationally selecting at least a portion of said expressed 
polynucleotides according to at least one sequence criterion; and 

(iv) storing said sequence information of said at least a portion of said 
expressed polynucleotides thereby generating said second 
database. 



40. The method of claim 39, wherein said at least one sequence 
criterion for computationally selecting said at least a portion of said expressed 
polynucleotide is selected from the group consisting of sequence length, 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 

41. The method of claim 35, further comprising the step of testing 
said putative naturally occurring antisense transcripts for an ability to form said 
duplex with said at least one sense oriented polynucleotide sequence under 
physiological conditions. 

42. The method of claim 35 further comprising the step of 
computationally testing said putative naturally occurring antisense transcripts 
according to at least one criterion selected from the group consisting of 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 

43. A computer readable storage medium comprising a database 
including a plurality of sequences, wherein each sequence is of a naturally 
occurring antisense transcript. 

44. The computer readable storage medium of claim 43, wherein said 
database further includes information pertaining to each sequence of said 
naturally occurring antisense transcripts, said information is selected from the 
group consisting of related sense gene, antisense length, complementarity 
length, complementarity position, intron molecules, alternative splicing sites, 
tissue specificity, pathological abundance, chromosomal mapping, open reading 
frames, promoters, hairpin structures, helix structures, stem and loops, 
pseudoknots and tertiary interactions, guanidine and/or cytosine content, 
guanidine tandems, adenosine content, thermodynamic criteria, RNA duplex 
melting point, RNA modifications, protein-binding motifs, palindromic 
sequence and predicted single stranded and double stranded regions. 

45. The computer readable storage medium of claim 43, wherein said 
database further includes information pertaining to generation of said database 
and potential uses of said database. 

46. A method of generating a database of naturally occurring 
antisense transcripts, the method comprising: 

(a) computationally aligning a first database including sense-oriented 
polynucleotide sequences with a second database including 
expressed polynucleotide sequences; 

(b) identifying expressed polynucleotide sequences from said second 
database being capable of forming a duplex with at least one 
sense-oriented polynucleotide sequence of said first database so 
as to identify putative naturally occurring antisense transcripts; 
and 

(c) storing sequence information of said identified naturally 
occurring antisense transcripts, thereby generating the database of 
the naturally occurring antisense transcripts. 
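
By way of a non-limiting illustration, steps (a) to (c) above may be sketched in Python as follows. The function names, the use of exact reverse-complement matching as a stand-in for alignment, and the `min_overlap` threshold are assumptions made for this example only and are not taken from the claims; a practical implementation would more likely rely on an alignment tool such as BLAST.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous stretch shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def find_putative_antisense(
    sense_db: Dict[str, str],
    expressed_db: Dict[str, str],
    min_overlap: int = 20,  # hypothetical threshold, not from the claims
) -> List[Tuple[str, str, int]]:
    """Step (a)/(b): compare each expressed sequence, as its reverse
    complement, with each sense sequence and keep pairs that share a
    stretch of at least min_overlap bases (i.e. could form a duplex)."""
    hits = []
    for ex_id, ex_seq in expressed_db.items():
        rc = reverse_complement(ex_seq.lower())
        for s_id, s_seq in sense_db.items():
            overlap = longest_common_substring(s_seq.lower(), rc)
            if overlap >= min_overlap:
                hits.append((ex_id, s_id, overlap))
    return hits  # step (c): the caller stores these records as the database
```

The quadratic comparison is chosen for clarity only; it would not scale to databases of the size contemplated here.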

47. The method of claim 46, wherein said first database includes 
sequences of a type selected from the group consisting of genomic sequences, 
expressed sequence tags, contigs, intron sequences, complementary DNA 
(cDNA) sequences, pre-messenger RNA (mRNA) sequences and mRNA 
sequences. 

48. The method of claim 46, wherein said second database includes 
sequences of a type selected from the group consisting of expressed sequence 
tags, contigs, complementary DNA (cDNA) sequences, pre-messenger RNA 
(mRNA) sequences and mRNA sequences. 

49. The method of claim 46, wherein an average sequence length of 
said expressed polynucleotide sequences of said second database is selected 
from a range of 0.02 to 0.8 Kb. 

50. The method of claim 46, wherein said second database is 
generated by: 

(i) providing a library of expressed polynucleotides; 

(ii) obtaining sequence information of said expressed 
polynucleotides; 

(iii) computationally selecting at least a portion of said expressed 
polynucleotides according to at least one sequence criterion; and 

(iv) storing said sequence information of said at least a portion of said 
expressed polynucleotides thereby generating said second 
database. 

51. The method of claim 50, wherein said at least one sequence 
criterion for computationally selecting said at least a portion of said expressed 
polynucleotide is selected from the group consisting of sequence length, 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 
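
As a hedged illustration of step (iii), the following Python sketch selects expressed sequences using a few of the listed criteria (sequence length, poly(T) head, poly(A) tail and poly(A) signal). The run length of 10, the 40-base search window and the per-sequence length bounds of 20 to 800 bases are assumptions chosen for the example; the claims recite 0.02 to 0.8 Kb only as an average length. The hexamer AATAAA is used as the poly(A) signal because it is the canonical form.

```python
import re
from typing import Dict

POLY_A_SIGNAL = "aataaa"  # canonical polyadenylation signal


def has_poly_a_tail(seq: str, min_run: int = 10) -> bool:
    """True if the sequence ends with a run of at least min_run A's."""
    return re.search(r"a{%d,}$" % min_run, seq.lower()) is not None


def has_poly_t_head(seq: str, min_run: int = 10) -> bool:
    """True if the sequence starts with a run of at least min_run T's."""
    return re.match(r"t{%d,}" % min_run, seq.lower()) is not None


def has_poly_a_signal(seq: str, window: int = 40) -> bool:
    """True if AATAAA occurs within the last `window` bases."""
    return POLY_A_SIGNAL in seq.lower()[-window:]


def select_expressed(
    library: Dict[str, str],
    min_len: int = 20,   # assumed lower bound (0.02 Kb)
    max_len: int = 800,  # assumed upper bound (0.8 Kb)
) -> Dict[str, str]:
    """Keep sequences within the length bounds that show at least one
    of the three polyadenylation-related features."""
    selected = {}
    for seq_id, seq in library.items():
        if not (min_len <= len(seq) <= max_len):
            continue
        if has_poly_a_tail(seq) or has_poly_t_head(seq) or has_poly_a_signal(seq):
            selected[seq_id] = seq
    return selected
```

The returned dictionary corresponds to the stored "second database" of step (iv) in this simplified setting.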

52. The method of claim 46 further comprising the step of testing the 
putative naturally occurring antisense transcripts for an ability to form said 
duplex with said at least one sense oriented polynucleotide sequence under 
physiological conditions. 

53. The method of claim 46 further comprising the step of computationally 
testing the putative naturally occurring antisense transcripts according to at least one 
criterion selected from the group consisting of sequence annotation, sequence 
information, intron splice consensus site, intron sharing, sequence overlap, rare 
restriction site, poly(T) head, poly(A) tail, and poly(A) signal. 

54. A system for generating a database of a plurality of putative 
naturally occurring antisense transcripts, the system comprising a processing 
unit, said processing unit executing a software application configured for: 

(a) computationally aligning a first database including sense-oriented 
polynucleotide sequences with a second database including 
expressed polynucleotide sequences; and 

(b) identifying expressed polynucleotide sequences from said second 
database being capable of forming a duplex with at least one 
sense-oriented polynucleotide sequence of said first database. 

55. The system of claim 54, wherein said first database includes 
sequences of a type selected from the group consisting of genomic sequences, 
expressed sequence tags, contigs, intron sequences, complementary DNA 
(cDNA) sequences, pre-messenger RNA (mRNA) sequences and mRNA 
sequences. 

56. The system of claim 54, wherein said second database includes 
sequences of a type selected from the group consisting of expressed sequence 
tags, contigs, complementary DNA (cDNA) sequences, pre-messenger RNA 
(mRNA) sequences and mRNA sequences. 

57. The system of claim 54, wherein an average sequence length of 
said expressed polynucleotide sequences of said second database is selected 
from a range of 0.02 to 0.8 Kb. 

58. The system of claim 54, wherein said second database is 
generated by: 

(i) providing a library of expressed polynucleotides; 

(ii) obtaining sequence information of said expressed 
polynucleotides; 

(iii) computationally selecting at least a portion of said expressed 
polynucleotides according to at least one sequence criterion; and 

(iv) storing said sequence information of said at least a portion of said 
expressed polynucleotides thereby generating said second 
database. 

59. The system of claim 58, wherein said at least one sequence 
criterion for computationally selecting said at least a portion of said expressed 
polynucleotide is selected from the group consisting of sequence length, 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 

60. The system of claim 54 further comprising the step of testing the 
putative naturally occurring antisense transcripts for an ability to form said 
duplex with said at least one sense oriented polynucleotide sequence under 
physiological conditions. 

61. The system of claim 54 further comprising the step of 
computationally testing the putative naturally occurring antisense transcripts 
according to at least one criterion selected from the group consisting of 
sequence annotation, sequence information, intron splice consensus site, intron 
sharing, sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and 
poly(A) signal. 

62. A method of identifying putative naturally occurring antisense 
transcripts, the method comprising screening a database of expressed 
polynucleotides sequences according to at least one sequence criterion, said at 
least one sequence criterion being selected to identify putative naturally 
occurring antisense transcripts. 

63. The method of claim 62, wherein said database includes 
sequences of a type selected from the group consisting of expressed sequence 
tags, contigs, complementary DNA (cDNA) sequences, pre-messenger RNA 
(mRNA) sequences and mRNA sequences. 

64. The method of claim 63, wherein an average sequence length of 
said expressed polynucleotide sequences of said second database is selected 
from a range of 0.02 to 0.8 Kb. 

65. The method of claim 63, wherein said at least one sequence 
criterion is selected from the group consisting of sequence length, sequence 
annotation, sequence information, intron splice consensus site, intron sharing, 
sequence overlap, rare restriction site, poly(T) head, poly(A) tail, and poly(A) 
signal. 

66. The method of claim 63 further comprising the step of testing the 
putative naturally occurring antisense transcripts for an ability to form a duplex 
with at least one sense oriented polynucleotide sequence under physiological 
conditions. 

67. A method of quantifying at least one mRNA of interest in a 
biological sample, the method comprising: 

(a) contacting the biological sample with at least one oligonucleotide 
capable of binding with the at least one mRNA of interest, 
wherein said at least one oligonucleotide is designed and 
configured so as to be complementary to a sequence region of the 
mRNA transcript of interest, said sequence region not being 
complementary with a naturally occurring antisense transcript; 
and 

(b) detecting a level of binding between the at least one mRNA of 
interest and said at least one oligonucleotide to thereby quantify 
the at least one mRNA of interest in the biological sample. 

68. The method of claim 67, wherein said at least one 
oligonucleotide is attached to a solid substrate. 

69. The method of claim 68, wherein said solid substrate is 
configured as a microarray and whereas said at least one oligonucleotide 
includes a plurality of oligonucleotides each attached to said microarray in a 
regio-specific manner. 

70. The method of claim 67, wherein said at least one 
oligonucleotide is labeled and whereas step (b) is effected by quantifying said 
label. 

71. The method of claim 67, wherein a length of said at least one 
oligonucleotide is selected from a range of 15-200 nucleotides. 

72. The method of claim 67, wherein said at least one oligonucleotide 
is a single stranded oligonucleotide. 

73. The method of claim 67, wherein said at least one oligonucleotide 
is a double stranded oligonucleotide. 

74. The method of claim 67, wherein a guanidine and cytosine 
content of said at least one oligonucleotide is at least 25 %. 
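
The design constraints of claims 67 and 71 to 74 can be illustrated with a short Python sketch that lists candidate target windows on an mRNA. It assumes the region of the mRNA that is complementary to the antisense transcript is already known as a half-open interval `[overlap_start, overlap_end)`, and it reads the recited "guanidine and cytosine content" as G+C content. The function names and the default oligonucleotide length of 25 are hypothetical.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty one)."""
    seq = seq.lower()
    if not seq:
        return 0.0
    return (seq.count("g") + seq.count("c")) / len(seq)


def candidate_target_windows(
    mrna: str,
    overlap_start: int,
    overlap_end: int,
    oligo_len: int = 25,   # assumed default, within the 15-200 range
    min_gc: float = 0.25,  # "at least 25 %"
) -> List[Tuple[int, str]]:
    """Return (start, window) pairs for mRNA windows that lie wholly
    outside the antisense-complementary interval and meet the G+C
    threshold. An oligonucleotide would be made complementary to the
    returned window."""
    if not 15 <= oligo_len <= 200:
        raise ValueError("oligo_len must be between 15 and 200")
    windows = []
    for start in range(0, len(mrna) - oligo_len + 1):
        end = start + oligo_len
        if start < overlap_end and end > overlap_start:
            continue  # window touches the antisense-complementary region
        window = mrna[start:end]
        if gc_fraction(window) >= min_gc:
            windows.append((start, window))
    return windows
```

Restricting probes to such windows is what allows the sense transcript to be measured without interference from its antisense partner.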

75. A method of quantifying the expression potential of at least one 
mRNA of interest in a biological sample, the method comprising: 

(a) contacting the biological sample with at least one pair of 
oligonucleotides including a first oligonucleotide capable of 
binding the at least one mRNA of interest and a second 
oligonucleotide being capable of binding a naturally occurring 
antisense transcript complementary to the mRNA of interest; and 

(b) detecting a level of binding between the at least one mRNA of 
interest and said first oligonucleotide and a level of binding 
between said naturally occurring antisense transcript 
complementary to the mRNA of interest and said second 
oligonucleotide to thereby quantify the expression potential of the 
at least one mRNA of interest in the biological sample. 

76. The method of claim 75, wherein a length of each of said first and 
second oligonucleotides is selected from a range of 15-200 nucleotides. 

77. The method of claim 75, wherein said first and second 
oligonucleotides are single stranded oligonucleotides. 

78. The method of claim 75, wherein said first and second 
oligonucleotides are double stranded oligonucleotides. 

79. The method of claim 75, wherein a guanidine and cytosine 
content of each of said first and second oligonucleotides is at least 25 %. 

80. The method of claim 75, wherein said first and second 
oligonucleotides are labeled and whereas step (b) is effected by quantifying said 
label. 

81. The method of claim 75, wherein said first and second 
oligonucleotides are attached to a solid substrate. 

82. The method of claim 81, wherein said solid substrate is 
configured as a microarray and whereas each of said first and second 
oligonucleotides includes a plurality of oligonucleotides each attached to said 
microarray in a regio-specific manner. 

83. A method of quantifying at least one naturally occurring antisense 
transcript of interest in a biological sample, the method comprising: 

(a) contacting the biological sample with at least one oligonucleotide 
capable of binding with the at least one naturally occurring 
antisense transcript of interest, wherein said at least one 
oligonucleotide is designed and configured so as to be 
complementary to a sequence region of the naturally occurring 
antisense transcript of interest, said sequence region not being 
complementary with a naturally occurring mRNA transcript; and 

(b) detecting a level of binding between the at least one naturally 
occurring antisense transcript of interest and said at least one 
oligonucleotide to thereby quantify the at least one naturally 
occurring antisense transcript of interest in the biological sample. 

84. The method of claim 83, wherein said at least one 
oligonucleotide is attached to a solid substrate. 

85. The method of claim 84, wherein said solid substrate is 
configured as a microarray and whereas said at least one oligonucleotide 
includes a plurality of oligonucleotides each attached to said microarray in a 
regio-specific manner. 

86. The method of claim 83, wherein said at least one 
oligonucleotide is labeled and whereas step (b) is effected by quantifying said 
label. 

87. The method of claim 83, wherein a length of said at least one 
oligonucleotide is selected from a range of 15-200 nucleotides. 

88. The method of claim 83, wherein said at least one oligonucleotide 
is a single stranded oligonucleotide. 

89. The method of claim 83, wherein said at least one oligonucleotide 
is a double stranded oligonucleotide. 

90. The method of claim 83, wherein a guanidine and cytosine 
content of said at least one oligonucleotide is at least 25 %. 

91. A method of identifying a novel drug target, the method 
comprising: 

(a) determining expression level of at least one naturally occurring 
antisense transcript of interest in cells characterized by an 
abnormal phenotype; and 

(b) comparing said expression level of said at least one naturally 
occurring antisense transcript of interest in said cells 
characterized by an abnormal phenotype to an expression level of 
said at least one naturally occurring antisense transcript of interest 
in cells characterized by a normal phenotype, to thereby identify 
the novel drug target. 

92. The method of claim 91, wherein said abnormal phenotype of 
said cells is selected from the group consisting of biochemical phenotype, 
morphological phenotype and nutritional phenotype. 

93. The method of claim 91, wherein said determining expression 
level of at least one naturally occurring antisense transcript of interest is 
effected by at least one oligonucleotide designed and configured so as to be 
complementary to a sequence region of said at least one naturally occurring 
antisense transcript of interest, said sequence region not being complementary 
with a naturally occurring mRNA transcript. 

94. The method of claim 93, wherein a length of said at least one 
oligonucleotide is selected from a range of 15-200 nucleotides. 

95. The method of claim 93, wherein said at least one oligonucleotide 
is a single stranded oligonucleotide. 

96. The method of claim 93, wherein said at least one oligonucleotide 
is a double stranded oligonucleotide. 

97. The method of claim 93, wherein a guanidine and cytosine 
content of said at least one oligonucleotide is at least 25 %. 

98. The method of claim 93, wherein said at least one oligonucleotide 
is labeled and whereas step (b) is effected by quantifying said label. 



WO 03/046220 PCT/IL02/00904 

99. The method of claim 93, wherein said at least one 
oligonucleotide is attached to a solid substrate. 

100. The method of claim 99, wherein said solid substrate is 
configured as a microarray and whereas said at least one oligonucleotide 
includes a plurality of oligonucleotides each attached to said microarray in a 
regio-specific manner. 

101. A method of treating or preventing a disease, condition or 
syndrome associated with an upregulation of a naturally occurring antisense 
transcript complementary to a naturally occurring mRNA transcript, the method 
comprising administering a therapeutically effective amount of an agent for 
regulating expression of the naturally occurring antisense transcript. 

102. The method of claim 101, wherein said agent for regulating 
expression of the naturally occurring antisense transcript is at least one 
oligonucleotide designed and configured so as to hybridize to a sequence region 
of said at least one naturally occurring antisense transcript. 

103. The method of claim 102, wherein said at least one 
oligonucleotide is a ribozyme. 

104. The method of claim 102, wherein said at least one 
oligonucleotide is a sense transcript. 

105. A method of diagnosing a disease, condition or syndrome 
associated with a substandard expression ratio of an mRNA of interest over a 
naturally occurring antisense transcript complementary to the mRNA of 
interest, the method comprising: 

(a) quantifying expression level of the mRNA of interest and the 
naturally occurring antisense transcript complementary to the 
mRNA of interest; 

(b) calculating the expression ratio of the mRNA of interest over the 
naturally occurring antisense transcript complementary to the 
mRNA of interest, thereby diagnosing the disease, condition or 
syndrome. 
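
Step (b) amounts to a simple quotient, sketched below in Python under stated assumptions. The claims do not define when a ratio is "substandard"; the comparison against a reference ratio scaled by a `tolerance` factor is purely a hypothetical criterion for the example, as are the function names.

```python
def expression_ratio(mrna_level: float, antisense_level: float) -> float:
    """Ratio of the mRNA level over the antisense transcript level."""
    if antisense_level <= 0:
        raise ValueError("antisense_level must be positive")
    return mrna_level / antisense_level


def is_substandard(ratio: float, reference_ratio: float,
                   tolerance: float = 0.5) -> bool:
    """Hypothetical rule: flag a ratio that falls below the given
    fraction (tolerance) of the ratio measured in a reference sample."""
    return ratio < reference_ratio * tolerance
```

For instance, with an mRNA level of 10 and an antisense level of 5 (arbitrary units) the ratio is 2.0, which this rule would flag against a reference ratio of 8.0.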

106. The method of claim 105, wherein quantifying said expression 
level of the mRNA of interest and the naturally occurring antisense transcript 
complementary to the mRNA of interest is effected by at least one pair of 
oligonucleotides including a first oligonucleotide capable of binding the mRNA 
of interest and a second oligonucleotide being capable of binding the naturally 
occurring antisense transcript complementary to the mRNA of interest. 

107. The method of claim 106, wherein a length of each of said first 
and second oligonucleotides is selected from a range of 15-200 nucleotides. 

108. The method of claim 106, wherein said first and second 
oligonucleotides are single stranded oligonucleotides. 

109. The method of claim 106, wherein said first and second 
oligonucleotides are double stranded oligonucleotides. 

110. The method of claim 106, wherein a guanidine and cytosine 
content of each of said first and second oligonucleotides is at least 25 %. 

111. The method of claim 106, wherein said first and second 
oligonucleotides are labeled. 

112. The method of claim 106, wherein said first and second 
oligonucleotides are attached to a solid substrate. 

113. The method of claim 112, wherein said solid substrate is 
configured as a microarray and whereas each of said first and second 
oligonucleotides includes a plurality of oligonucleotides each attached to said 
microarray in a regio-specific manner. 

Fig. 4a 

570_0 AV705532_0 190 Z44352_15 783 OL: 52 

Query: 1     ggacccaggatatgagcggaaaacactttctctacttagatacaactttttc 52 
             |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 52    ggacccaggatatgagcggaaaacactttctctacttagatacaactttttc 1 

Fig. 4b 

570_1 AV705532_0 190 Z44352_14 1649 OL: 52 

Query: 1     ggacccaggatatgagcggaaaacactttctctacttagatacaactttttc 52 
             |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 52    ggacccaggatatgagcggaaaacactttctctacttagatacaactttttc 1 

Fig. 4c 

570_2 AV705532_0 190 Z44352_13 1861 OL: 52 

Query: 1     ggacccaggatatgagcggaaaacactttctctacttagatacaactttttc 52 
             |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 52    ggacccaggatatgagcggaaaacactttctctacttagatacaactttttc 1 



Fig. 4d 



571_0 AW070860_0 214 T81142_7 1934 OL: 54 

Query: 1     gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 54 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 1215  gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 1162 

Score = 22.3 bits (11), Expect = 0.66 
Identities = 11/11 (100%) 
Strand = Plus / Minus 

Query: 146   acttccagagg 156 
             ||||||||||| 
Sbjct: 760   acttccagagg 750 


Fig. 4e 



571_1 AW070860_0 214 T81142_6 2353 OL: 54 

Query: 1     gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 54 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 1215  gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 1162 

Score = 22.3 bits (11), Expect = 0.66 
Identities = 11/11 (100%) 
Strand = Plus / Minus 

Query: 146   acttccagagg 156 
             ||||||||||| 
Sbjct: 760   acttccagagg 750 

Score = 22.3 bits (11), Expect = 0.66 
Identities = 11/11 (100%) 
Strand = Plus / Plus 

Query: 100   ggaaaacacac 110 
             ||||||||||| 
Sbjct: 1900  ggaaaacacac 1910 

Fig. 4f 



571_2 AW070860_0 214 T81142_J 2500 OL: 54 

Query: 1     gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 54 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 1317  gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 1264 

Score = 22.3 bits (11), Expect = 0.66 
Identities = 11/11 (100%) 
Strand = Plus / Minus 

Query: 146   acttccagagg 156 
             ||||||||||| 
Sbjct: 862   acttccagagg 852 

Score = 22.3 bits (11), Expect = 0.66 
Identities = 11/11 (100%) 
Strand = Plus / Plus 

Query: 100   ggaaaacacac 110 
             ||||||||||| 
Sbjct: 2047  ggaaaacacac 2057 



Fig. 4g 

571_3 AW070860_0 214 T81142_3 947 OL: 54 

Query: 1     gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 54 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 224   gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 171 



Fig. 4h 

571_4 AW070860_0 214 T81142_2 1366 OL: 54 

Query: 1     gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 54 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 224   gtaagggaactttggcgacttagtgcgatcactgggagaattgtagagtccact 171 

Score = 22.3 bits (11), Expect = 0.66 
Identities = 11/11 (100%) 
Strand = Plus / Plus 

Query: 100   ggaaaacacac 110 
             ||||||||||| 
Sbjct: 913   ggaaaacacac 923 

Fig. 4i 

572_0 BE046369_0 422 W26553_3 1532 OL: 52 

Query: 1     aatcttcataatccccatgtgtcaaaggagagaccaggtggaggtaactgaa 52 
             |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 1481  aatcttcataatccccatgtgtcaaaggagagaccaggtggaggtaactgaa 1532 

Fig. 4j 

572_1 BE046369_0 422 W26553_2 1753 OL: 52 

Query: 1     aatcttcataatccccatgtgtcaaaggagagaccaggtggaggtaactgaa 52 
             |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 1702  aatcttcataatccccatgtgtcaaaggagagaccaggtggaggtaactgaa 1753 

Fig. 4k 



572_2 BE046369_0 422 W26553_1 1832 OL: 52 

Query: 1     aatcttcataatccccatgtgtcaaaggagagaccaggtggaggtaactgaa 52 
             |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 1781  aatcttcataatccccatgtgtcaaaggagagaccaggtggaggtaactgaa 1832 

53BP1_76P 53BP1 10394 76P 6837 OL: 3046 OF1: 5463 OF2: 2018 

Score = 1659 bits (837), Expect = 0.0 
Identities = 840/841 (99%) 
Strand = Plus / Minus 

Query: 7432  gagacggaatttcgctcttgttgcccaggctggaatgcaatggcacaatctcagctcact 7491 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 3113  gagacggaatttcgctcttgttgcccaggctggaatgcaatggcacaatctcagctcact 3054 

Query: 7492  gcagcgtctgcttcccaggttcaagcaattctcctgtctcagcctcctgagtagctggga 7551 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 3053  gcagcgtctgcttcccaggttcaagcaattctcctgtctcagcctcctgagtagctggga 2994 

Query: 7552  ttacaggcacatgccaccacacctggctaatttttgtatttttagtagaatcgaggtttc 7611 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2993  ttacaggcacatgccaccacacctggctaatttttgtatttttagtagaatcgaggtttc 2934 

Query: 7612  atcatgttggtcaggctggtctcaaactcctgacttcaggtgatccgcccgcctcggcct 7671 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2933  atcatgttggtcaggctggtctcaaactcctgacttcaggtgatccgcccgcctcggcct 2874 

Query: 7672  cccaaagtgctgggattacaggtgtgagccaccatgcccggcctaagaaatacttttaag 7731 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2873  cccaaagtgctgggattacaggtgtgagccaccatgcccggcctaagaaatacttttaag 2814 

Query: 7732  tatattttcattagctagaattgcccaatctgtgtaggtataaattacttggtataggga 7791 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2813  tatattttcattagctagaattgcccaatctgtgtaggtataaattacttggtataggga 2754 

Query: 7792  gagagaaagcctatcttacctgttgctttcttacttggtggtaacatccagcagttagtc 7851 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2753  gagagaaagcctatcttacctgttgctttcttacttggtggtaacatccagcagttagtc 2694 

Query: 7852  tatttataaacataattactttttcacatatgaaccataaaatatttaactttctgctct 7911 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2693  tatttataaacataattactttttcacatatgaaccataaaatatttaactttctgctct 2634 

Query: 7912  atattgtttgtttaccgctgtatctcccacagcttgaacagtaccaaggtacgtagtagg 7971 
             ||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2633  atattgtttgtctaccgctgtatctcccacagcttgaacagtaccaaggtacgtagtagg 2574 

Query: 7972  tgctcaataaatgactattgaataaatgaacatatccaacaaatgttctcaatgtaaagg 8031 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2573  tgctcaataaatgactattgaataaatgaacatatccaacaaatgttctcaatgtaaagg 2514 

Query: 8032  atcagagatgccacatgttctccttgatgggagagacccttccacatgggaatgatggga 8091 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2513  atcagagatgccacatgttctccttgatgggagagacccttccacatgggaatgatggga 2454 

Query: 8092  aggagttgtactcctggatgttcagtaactgcttctaggagaaaaggtagagtcctatca 8151 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2453  aggagttgtactcctggatgttcagtaactgcttctaggagaaaaggtagagtcctatca 2394 

Query: 8152  ctaagccgcagatatttatttgtgtgtggctagaatgggatgttttgaatcttctgttac 8211 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2393  ctaagccgcagatatttatttgtgtgtggctagaatgggatgttttgaatcttctgttac 2334 

Query: 8212  aaccttgggaacgtggctgttatttcaatttatgagccagaaattttcacatcccgaaac 8271 
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Sbjct: 2333  aaccttgggaacgtggctgttatttcaatttatgagccagaaattttcacatcccgaaac 2274 

Query: 8272  t 8272 
             | 
Sbjct: 2273  t 2273 

Score = 1655 bits (835), Expect = 0.0 
Identities = 849/856 (99%) 
Strand = Plus / Minus 

Fig. 5a continued 
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Query: 5903 agatattgctttaggggtatttgatgtggtggtgacggacccctcatgcccagcctcggt 
5962 

I M 1 I 1 I 1 K I I ! 1 1 1 1 t 1 I 1 I I I I I I 1 1 ! I K 1 1 M I 1 I 1 I 1 I I I 1 1 I I ! I I I I 1 I I 1 I ! I 
Sbj ct : 4 642 agatattgctttaggggtatttgatgtggtggtgacggacccctcatgcccagcctcggt 
4583 



Query: 5963 gctgaagtgtgctgaagcattgcagctgcctgtggtgtcacaagagtgggtgatccagtg 
6022 

I! I I I II II I M I I I I I I II I I 1 II I II M 1 I I I I I I I I II It I I It II I I I M I ! I I I I 
Sbjct: 4582 gctgaagtgtgctgaagcatfcgcagctgcctgtggtgtcacaagagtgggtgatccagtg 
4523 



Query: 6023 cctcattgttggggagagaattggattcaagcagcatccaaaatataaacacgattatgt 
6082 

I I I I I I I M II I I M I I I I I I I I I I I I I I I I I I I I I I M I M I I I II I I I I I I I I I I I I I 
Sbj ct : 4 522 cctcattgttggggagagaattggattcaagcagcatccaaaatataaacacgattat gt 
4463 



Query : 6083 ttctcactaaagatacttggt cttactggt t ttattccctgctatcgtggagatt gtgtt 
6142 

II I I I I II I I I I I I I I M I I I I I I I I I II I I I M M I I I I I I I M I I M M M I II I M I 
Sbjct : 4462 ttctcactaaagatacttggt cttactggttttattccctgctatcgtggagattgtgtt 
4403 



Query: 614 3 ttaaccaggttttaaatgtgtcttgtgtgtaactggattccttgcatggatcttgtatat 
6202 

'l I I I I I I I II I I I I I I I 1 I I I I I I I II I I I M I I I I I I I I I I I II I I II I I I I I 1 I I I I I 
Sbjct : 4402 ttaaccaggttttaaatgtgtcttgtgtgtaactggattccttgcatggatcfctgtatat 
4343 



Query: 6203 agttttatttgctgaact-tttatgataaaataaatgttgaatctctttggttgtagtaac 
6262 

I I I I I I I I I I I I I I II II I I I II I I I I II I) I I II I II I I M II I I M I I I ( I I II I II I 
Sbjct : 4342 agttttatttgctgaacttttatgataaaataaatgttgaatctctttggttgtagtaac 
4283 



Query: 6263 tgggatttcttcatctgnnnnnnngagcttaatctcagaacaaatgacaagacatagtac 
6322 

I I I I I I I II M I I I M I I I ( I II I H I f II f II M II I II I I I I I I M I I I I I 

Sbjct : 4282 tgggatttcttcatctgtttttttgagcttaat ctcagaacaaatgacaagacatagtac 
4223 



Query: 6323 tttctctgagtctttcaacaggcttattcacttacggaggacagctcaccaaggaaattg 
6382 

I M I I II f If M f I II M II I I II / / I M I M M I I I N I M H II M M M I I I I I I J i 
Sbjct: 4222 tttctctgagtctttcaacaggcttattcacttacggaggacagctcaccaaggaaattg 4163



Query: 6383 aaaagttaagagtgaactttattctgtggcatcattcccaanaggttattccagggtgtc 6442

Fig. 5a continued 






            ||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Sbjct: 4162 aaaagttaagagtgaactttattctgtggcatcattcccaaaaggttattccagggtgtc 4103



Query: 6443 taaaatgctatgcttgcagaaactcagtttaaggtaggtgaaggcccagattaacagttg 6502
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 4102 taaaatgctatgcttgcagaaactcagtttaaggtaggtgaaggcccagattaacagttg 4043



Query: 6503 tgccaaaagttgagtggaattgggcacagctctgtttcctgacagttaaaaaagacctca 
6562 

M | 1 | n M M It I II I II I ) H ) I II M II I M 1 I I I I I ! M I I I I II M II II M I II 
Sbjct: 4042 tgccaaaagttgagtggaattgggcacagctctgtttcctgacagttaaaaaagacctca 

3983 



Query: 6563 tgctctctctctgagctgagatcacagctcacctgtgggtactccccaactcttagagct 6622
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3982 tgctctctctctgagctgagatcacagctcacctgtgggtactccccaactcttagagct 3923



Query: 6623 aaagggagaacgaaaggaccaactgccatgaagggacagtgaccataagcttgatggaat 6682
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3922 aaagggagaacgaaaggaccaactgccatgaagggacagtgaccataagcttgatggaat 3863



Query: 6683 gaccttccgtaagataaacatgggaagcacaagtgagaacacctggaaatgttacacgtt 6742
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3862 gaccttccgtaagataaacatgggaagcacaagtgagaacacctggaaatgttacacgtt 3803



Query: 6743 ctagtcaaagacccaa 6758
            ||||||||||||||||
Sbjct: 3802 ctagtcaaagacccaa 3787



Score = 1211 bits (611), Expect = 0.0
Identities = 625/632 (98%)
Strand = Plus / Minus



Query: 6778 gtcacaatagctggaagcagttccttcccttcctctggcatcactgatccctgcatggct 6837
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3767 gtcacaatagctggaagcagttccttcccttcctctggcatcactgatccctgcatggct 3708



Fig. 5a continued 






Query: 6838 tctcattctctaaagcaggggtcaacaaggnnnnnnnctgtaaagggtcaaagagtaaat 6897
            ||||||||||||||||||||||||||||||       |||||||||||||||||||||||
Sbjct: 3707 tctcattctctaaagcaggggtcaacaaggtttttttctgtaaagggtcaaagagtaaat 3648



Query: 6898 atttcaggctttgtgggccatttgatccatcacaactactcgcctttgctgtgagggcat 
6957 

1 1 1 1 I I 1 1 II 1 I I M M M » 1 1 1 M I I I M 1 1 t 1 1 It 1 1 I I I 1 1 I I I I I 1 1 1 1 M 1 1 I M 
Sbjct: 3647 atttcaggctttgtgggccatttgatccatcacaactactcgcctttgctgtgagggcat 3588



Query: 6958 gaaagcaaccatagacaatgagtaaacaaatgggcacggctgligrtttcagtaaaactgta 
7017 

I I f ! 1 1 1 1 I 1 1 M I I II I I I 1 M I I M I 1 M I I 1 1 I M 1 I I II I 1 1 I I I M M M M I 1 1 
Sbjct: 3587 gaaagcaaccatagacaatgagtaaacaaatgggcacggctgt-gtttcagtaaaactgta 3528



Query: 7018 caaaaacagacagcaggccatagtttgccagctcctgctccagagacagcagtggaaagg 
7077 

I II II M I II I I I I II I II 11 I II I I I M M I I I 1 M II I I II I I I I I I I Ml I M I I I I 
Sbjct: 3527 caaaaacagacagcaggccatagtttgccagctcctgctccagagacagcagtggaaagg 

3468 



Query: 7078 gtgatctttagttgataatagcagggaataagttgtcagagcttcccagtgtgtgtagaa 7137
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3467 gtgatctttagttgataatagcagggaataagttgtcagagcttcccagtgtgtgtagaa 3408



Query: 7138 tatgtagtgatgaaaaccagatgcagtgactataacctgatgccagaacactgcattctt 
7197 

I M I II I I M I I M ! II M I II I I M I M M M M M I I M II II M M I I I I 11 M I I I 
Sbjct: 3407 tatgtagtgatgaaaaccagatgcagtgactataacctgatgccagaacactgcattctt 3348



Query: 7198 tttcagtttggagggcgttgttcagtgaatatttctttttacttacactgatatgaatat 
7257 

I I I I I I I I I I M I I ! 11 II M I I M I M i I M II I I M I M I M I I I I I II II I I I I M I 
Sbjct: 3347 tttcagtttggagggcgttgttcagtgaatatttctttttacttacactgatatgaatat 3288



Query: 7258 tgattaccagtgatggctgggccatattaagataacttcaacccctatggtttgtgtaag 7317
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3287 tgattaccagtgatggctgggccatattaagataacttcaacccctatggtttgtgtaag 3228



Query: 7318 atgggtaattgggcctgcaatcttcagtatttaaaaatctaacaacttgatctcaatttt 7377

Fig. 5a continued






            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3227 atgggtaattgggcctgcaatcttcagtatttaaaaatctaacaacttgatctcaatttt 3168



Query: 7378 ttcttaaggacctttttcttggagaataatac 7409
            ||||||||||||||||||||||||||||||||
Sbjct: 3167 ttcttaaggacctttttcttggagaataatac 3136



Score = 404 bits (204), Expect = e-115
Identities = 204/204 (100%)
Strand = Plus / Minus



Query: 5556 cagtgtaacacagcttaccagtgtcttctaattgcggatcagcattgtcgaacccggaag 5615
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 6169 cagtgtaacacagcttaccagtgtcttctaattgcggatcagcattgtcgaacccggaag 6110



Query: 5616 tacttcctgtgccttgccagtgggattccttgtgtgtctcatgtctgggtccatgatagt 5675
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 6109 tacttcctgtgccttgccagtgggattccttgtgtgtctcatgtctgggtccatgatagt 6050



Query: 5676 tgccatgccaaccagctccagaactaccgtaattatctgttgccagctgggtacagcctt 5735
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 6049 tgccatgccaaccagctccagaactaccgtaattatctgttgccagctgggtacagcctt 5990



Query: 5736 gaggagcaaagaattctggactgg 5759
            ||||||||||||||||||||||||
Sbjct: 5989 gaggagcaaagaattctggactgg 5966



Score = 291 bits (147), Expect = 1e-80
Identities = 147/147 (100%)
Strand = Plus / Minus



Query: 5758 ggcaaccccgtgaaaatcctttccagaatctgaaggtactcttggtatcagaccaacagc 5817
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 5159 ggcaaccccgtgaaaatcctttccagaatctgaaggtactcttggtatcagaccaacagc 5100



Query: 5818 agaacttcctggagctctggtctgagatcctcatgactggtggtgcagcctctgtgaagc 5877
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 5099 agaacttcctggagctctggtctgagatcctcatgactggtggtgcagcctctgtgaagc 5040



Fig. 5a continued 




Query: 5878 agcaccattcaagtgcccataacaaag 5904
            |||||||||||||||||||||||||||
Sbjct: 5039 agcaccattcaagtgcccataacaaag 5013

Score = 281 bits (142), Expect = 9e-78
Identities = 142/142 (100%)
Strand = Plus / Minus

Query: 8920 ctgcccagagttccaccagcctgggtatagtatttgttataatctagtcgtaacagtagt 
8979 

| n I I M N I II I II I II I I N N II I M I I I M I II N N II 1 II M II I II M I N I I 
Sbjct : 2274 ctgcccagagttccaccagcctgggtatagtatttgttataatctagtcgtaacagtagt 

2215 

Query: 8980 tgagccaaatctgagttgatctgatgattccgaacactggagagaatcttgaacaggagt 9039
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2214 tgagccaaatctgagttgatctgatgattccgaacactggagagaatcttgaacaggagt 2155

Query: 9040 gaagactggcggctaaagccct 9061
            ||||||||||||||||||||||
Sbjct: 2154 gaagactggcggctaaagccct 2133

Score = 226 bits (114), Expect = 5e-61
Identities = 117/118 (99%)
Strand = Plus / Minus

Query: 9673 ccttcacgagaatgctcagctgggcggctccacgctcatccagtgggcctaggttctgac 
9732 

I I I I I I I I | | | || III I N II II M IN II I I IN MN I M M M I N I N M \\ III! 

Sbjct: 2135 ccttcacgagaatgctcagctgggcggctccacgctcatccagtgggcctaggttctgac 2076

Query: 9733 tgaccagcgaacaaaaactgtgacagagatctaggatttcattcaggcagtgaaacac 9790 

M 11 M 11 | I i | I II I I I M M I II M I II I I I N I) I M II I M N I I I I N I I I I 
Sbjct: 2075 tgaccagcaaacaaaaactgtgacagagatctaggatttcattcaggcagtgaaacac 2018 

Score = 190 bits (96), Expect = 3e-50
Identities = 96/96 (100%)
Strand = Plus / Minus

Query: 5463 gaatttttggaaattcctcctttcaacaagcagtatacagaatcccagcttcgagcagga 5522
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 6812 gaatttttggaaattcctcctttcaacaagcagtatacagaatcccagcttcgagcagga 6753

Fig. 5a continued 






Query: 5523 gctggctatatccttgaagatttcaatgaagcccag 5558
            ||||||||||||||||||||||||||||||||||||
Sbjct: 6752 gctggctatatccttgaagatttcaatgaagcccag 6717



Score = 52.0 bits (26), Expect = 2e-08
Identities = 26/26 (100%)
Strand = Plus / Minus



Query: 7668 gcctcccaaagtgctgggattacagg 7693
            ||||||||||||||||||||||||||
Sbjct: 5312 gcctcccaaagtgctgggattacagg 5287



Fig. 5a continued 




CIDEB1_BLTR2 CIDEB1 2289 BLTR2 6530 OL: 2254 OF1: 17 OF2: 1



Score = 2727 (753.8 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 547/549 (99%), Positives = 547/549 (99%), Strand = Minus / Plus



Query: 2250 TTTTGTTAGTTTGAGGGGAAGGGTATGAAGACAGATCTCAAGGTAAAGTCAGAGAGGGCT 2191
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1    TTTTGTTAGTTTGAGGGGAAGGGTATGAAGACAGATCTCAAGGTAAAGTCAGAGAGGGCT 60

Query: 2190 GTCATCAGTATGCTGGGGAGTTTAGGGACAGGAGGCATTGGTAGGGGATTAGATGTAGCA 2131
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61   GTCATCAGTATGCTGGGGAGTTTAGGGACAGGAGGCATTGGTAGGGGATTAGATGTAGCA 120

Query: 2130 GCAGTCAGGCTGGGATCAAGATGCCTGGGGGACATCTTGATCTTGGCCTTTCAGGGCAAG 2071
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 121  GCAGTCAGGCTGGGATCAAGATGCCTGGGGGACATCTTGATCTTGGCCTTTCAGGGCAAG 180

Query: 2070 TGGGAGGCTAGAAAGGTGGCTAGGAAAGAACAGCATTCTTCAGGTAAGGGTATAGACTTG 2011
            |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 181  TGGGAGGCCAGAAAGGTGGCTAGGAAAGAACAGCATTCTTCAGGTAAGGGTATAGACTTG 240

Query: 2010 GGATGTGAGGCGTTATGCTGAAAGGTTCTGTCACGAGGGGATCAGAGGACAGTGGGGAAA 1951
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 241  GGATGTGAGGCGTTATGCTGAAAGGTTCTGTCACGAGGGGATCAGAGGACAGTGGGGAAA 300

Query: 1950 TTGGGTGGGTTATCTAGCCTGTACTGTCTGCAGGTCCTGAAATTTGATGCTGTCATAGTC 1891
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 301  TTGGGTGGGTTATCTAGCCTGTACTGTCTGCAGGTCCTGAAATTTGATGCTGTCATAGTC 360

Query: 1890 TTTGCAGTGGGTCGGTTGGAATGATTCTGGGGGCAGAAGCTCAGAGCCCCTTAGTAGGAA 1831
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 361  TTTGCAGTGGGTCGGTTGGAATGATTCTGGGGGCAGAAGCTCAGAGCCCCTTAGTAGGAA 420

Query: 1830 TGGAGGCGGCCCTTCTGCTGCCACTGCTCAGCCCCCTCCACTGCATGACGAAGGGTGGAG 1771
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 421  TGGAGGCGGCCCTTCTGCTGCCACTGCTCAGCCCCCTCCACTGCATGACGAAGGGTGGAG 480

Query: 1770 GAAATTCCCAGCAACATATGGCCCAGGCCTTGCAGCAGTGTGGAGGTCCAACGAAGGAGC 1711
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 481  GAAATTCCCAGCAACATATGGCCCAGGCCTTGCAGCAGTGTGGAGGTCCAACGAAGGAGC 540



Fig. 5b 






Query: 1710 TCCCTGAGT 1702
            ||||||| |
Sbjct: 541  TCCCTGAAT 549

Score = 1322 (365.4 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 266/268 (99%), Positives = 266/268 (99%), Strand = Minus / Plus

Query : 757 CCTGTAGGCCCAGTVAGGATGTCGGTCTGCTACCGTCCCCCAGGGAACGAGACACTGCTGA 
698 

I I I 1 1 I I I M I I I I I I 1 I 1 I 1 1 1 1 1 I I I I I 1 1 ! I 1 1 I I I 1 1 I I 1 1 1 I I I 1 I I 1 I I I 1 t I I 
Sb j C t : 5426 CCTGTAGGCCCAGAAGGATGTCGGTCTGCTACCGTCCCCCAGGGAACGAGACACTGCTGA 

5485 

Query : 697 GCTGGAAGACTTCGCGGGCCACAGGCACAGCCTTCCTGCTGCTGGCGGCGCTGCTGGGGC 
638 

II M I I I I I II ! I I I I I I I I I I I I I II I I I I I I I M I M Ml I I I I I II I II 1 ! I I I I I I 

Sb j ct : 54 86 GCTGGAAGACTTCGCGGGCCACAGGCACAGCCTTCCTGCTGCTGGCGGCGCTGCTGGGGC 
5545 

Quory : 637 TGCCTCCCAACCCCTTCGTGGTGTGGAGCTTGCCCGGCTGGCAGCCTCCACCCCGGCCAC 
578 

I I I I I II II I I I I I I I I II I I I I I I I I I I I M M I 1 I I I I I I I I I I I I I I II I I I 1 I 11 
Sb j ct : 554 6 TGCCTGGCAACGGCTTCGTGGTGTGGAGCTTGGCGGGCTGGCGGCCTGCACGGGGGCGAC 
5605 

Query : 577 CGCTGGCGGCCACGCTTGTGCTGCACCTGGCGCTGGCCGACGGCGCGGTGCTGCTGCTCA 
518 

I I I I I I I II I I I I M I I I I I M II I I I I II I I M I I I M I I i I t 1 ) M M I I I I 1 I I M I 
Sb j ct : 5 60 CGCTGGCGGCCACGCTTGTGCTGCACCTGGCGCTGGCCGACGGCGCGGTGCTGCTGCTCA 
5665 

Query: 517 CGCCGCTCTTTGTGGCCTTCCTGACCGG 4 90 

I I I I I I I I I 1 I I I I I I I I 1 II I I I M I 
Sbjct: 5666 CGCCGCTCTTTGTGGCCTTCCTGACCCG 5693 

Score = 1316 (363.8 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 264/265 (99%), Positives = 264/265 (99%), Strand = Minus / Plus

Query : 421 CAAGCGTGCTGCTCACCCGCCTGCTCACCCTGCAGCGCTGCCTCGCAGTCACCCGCCCCT 
362 

I I I I U I I I I I I I I I I II I I I I I II I I II M I I I II II I I I I I I I I I I I I I I I II M I I 

Sb j ct : 5762 CCAGCGTGCTGCTCACCGGCCTGCTCAGCCTGCAGCGCTGCCTCGCAGTCACCCGCCCCT 
5821 

Query : 361 TCCTGGCGCCTCGGCTGCGCAGCCCGGCCCTGGCCCGCCGCCTGCTGCTGGCGGTCTGGC 
302 

I I I I I II I I I M I I I I I I I II I I I I II I I I I I I M I I I I M I I I I I I II I I I I II I I II I 
Sb j c t : 5822 TCCTGGCGCCTCGGCTGCGC AGCCCGGCCCTGGCCCGCCGCCTGCTGCTGGCGGTCTGGC 
5881 

Query : 301 TGGCCGCCCTGTTGCTCGCCGTCCCGGCCGCCGTCTACCGCCACCTGTGGAGGGACCGCG 
242 

I I I I I II i I I I I M 1 1 I I I I I II II I II I I I I I I I I I I I I I I I I I I I M 1 1 I I I I I II I I 
Sbjct: 5882 TGGCCGCCCTGTTGCTCGCCGTCCCGGCCGCCGTCTACCGCCACCTGTGGAGGGACCGCG 5941

Fig. 5b continued 






Query: 241  TATGCCAGCTGTGCCACCCGTCGCCGGTCCACGCCGCCGCCCACCTGAGCCTGGAGACTC 182
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 5942 TATGCCAGCTGTGCCACCCGTCGCCGGTCCACGCCGCCGCCCACCTGAGCCTGGAGACTC 6001



Query: 181 TGACCGCTTTCGTGCTTCCTTTCGG 157 

I M I I M i I I I i I I I I I I I I I I I M 
Sbjct: 6002 TGACCGCTTTCGTGCTTCCTTTCGG 6026 

Score = 920 (254.3 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 188/193 (97%), Positives = 188/193 (97%), Strand = Minus / Plus



Query: 1708 CCTGAGTACTTTCTTTGGGCCAAGTCCTTGAAAGTCACAACTCATAGAGTAGAGCCCGTA 1649
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 724  CCTGAGTACTTTCTTTGGGCCAAGTCCTTGAAAGTCACAACTCATAGAGTAGAGCCCGTA 783

Query: 1648 GAATGTGGCTTTGACATTCAGGCTGCCAAAGAGGTCTCGAGGGTTTTGCTTGTACACGTC 1589
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 784  GAATGTGGCTTTGACATTCAGGCTGCCAAAGAGGTCTCGAGGGTTTTGCTTGTACACGTC 843

Query: 1588 AAAGGTGAATCGGCCGATGTCCTTGCTGTGCTTGGGGCTCTCCCGTCCAGGCCCATATGA 1529
            ||||||||||||| |||||||||||||||||||||| |||||||||||  | ||||||||
Sbjct: 844  AAAGGTGAATCGGGCGATGTCCTTGCTGTGCTTGGGCCTCTCCCGTCCCAGGCCATATGA 903

Query: 1528
            |||||||||||||
Sbjct: 904

Score = 753, Sum P(13) = 0.0
Identities = 157/165 (95%), Positives = 157/165 (95%), Strand = Minus / Plus

Query: 1529 ACAGCACTCCACTCCTTGTAGGGCTCCAGCTCTGACCAGACTGCAACACCATCAGGCACG 1470
            | ||  |||    |||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1139 ATAGGCCTCTTACCCTTGTAGGGCTCCAGCTCTGACCAGACTGCAACACCATCAGGCACG 1198

Query: 1469 TGTCATCCTCCAGCAGCTGGAAGAAGTCCTCACTGTCCACTGCAGTTCCATCCTCCTCTA 1410
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1199 TGTCATCCTCCAGCAGCTGGAAGAAGTCCTCACTGTCCACTGCAGTTCCATCCTCCTCTA 1258

Query: 1409 GCACCAGGGTTAGCACTCCATTCAGCAGTAGGGTCTCCAATGCTT 1365
            ||||||||||||||||||||||||||||||||||||||||||| |
Sbjct: 1259 GCACCAGGGTTAGCACTCCATTCAGCAGTAGGGTCTCCAATGCCT 1303

Score = 746
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BNSDOCID: <WO 03D46Z20A1. I.> 






Identities = 150/151 (99%), Positives = 150/151 (99%), Strand = Minus / Plus



Query: 1369 TGCTTTGGCTAGCAGCTCCTGGCGGGTGGCAGCTGTCAGGCCTTTCCGGATGGTCCGCTT 1310
            | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2146 TACTTTGGCTAGCAGCTCCTGGCGGGTGGCAGCTGTCAGGCCTTTCCGGATGGTCCGCTT 2205

Query: 1309 GTGATCACAGACACGGAAAGGTCGCTGGGGTGGTGGAGCTGAGGTCCAGACCCTCCGTCC 1250
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2206 GTGATCACAGACACGGAAAGGTCGCTGGGGTGGTGGAGCTGAGGTCCAGACCCTCCGTCC 2265

Query: 1249 AAACTCCGAGCTTATATTAGATACTGACCTG 1219
            |||||||||||||||||||||||||||||||
Sbjct: 2266 AAACTCCGAGCTTATATTAGATACTGACCTG 2296

Score = 737 (203.7 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 148/150 (98%), Positives = 149/150 (99%), Strand = Minus / Plus



Query: 1118 CTGCTCTTTCCTTCCTCCTTGGTCGGAGGAGGGGCTGGCTCACTGCTCTGGCTTCATTTT 1059
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3257 CTGCTCTTTCCTTCCTCCTTGGTCGGAGGAGGGGCTGGCTCACTGCTCTGGCTTCATTTT 3316

Query: 1058 CCAGAGCTGCCTGCTGCAGTCACACTTAGGTCATCTTCTCTCACTTTTCTCCTTTTGCCG 999
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3317 CCAGAGCTGCCTGCTGCAGTCACACTTAGGTCATCTTCTCTCACTTTTCTCCTTTTGCCG 3376



Query: 998 AT TACTG G ACGTG ACAGAG ATGTG AAT RTG 969 

I M II II I I I M M I M I I I I I I M II + I 
Sbjct: 3377 ATTAGTGGACGTGACAGAGATGTGAATGGG 3406 



Score = 714 (197.4 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 146/150 (97%), Positives = 146/150 (97%), Strand = Minus / Plus

Query: 166  TTCCTTTCGGCTGATGCTCGGCTGCTACAGCGTGACGCTGGCACGGCTGCGGGGCGCCCG 107
            | | ||  ||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 6018 TCCTTTCGGGCTGATGCTCGGCTGCTACAGCGTGACGCTGGCACGGCTGCGGGGCGCCCG 6077

Query: 106  CTGGGGCTCCGGGCGGCACGGGGCGCGGGTGGGCCGGCTGGTGAGCGCCATCGTGCTTGC 47
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 6078 CTGGGGCTCCGGGCGGCACGGGGCGCGGGTGGGCCGGCTGGTGAGCGCCATCGTGCTTGC 6137



Query: 46   CTTCGGCTTGCTCTGGGCCCCCTACCACGC 17
            ||||||||||||||||||||||||||||||
Sbjct: 6138 CTTCGGCTTGCTCTGGGCCCCCTACCACGC 6167
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BNSDOCID: <WO. JW048220A1. I .> 






Score = 638 (176.4 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 130/133 (97%), Positives = 130/133 (97%), Strand = Minus / Plus

Query: 962  GGGCAGGGGATGTCCTTTGATGGCATCAAGACTTTAGCTTCTGGTGCGCTGTGTCCCAGC 903
            |||   ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3404 GGGGCAGGGATGTCCTTTGATGGCATCAAGACTTTAGCTTCTGGTGCGCTGTGTCCCAGC 3463

Query: 902  TCTGATTTCAGTTGCAGCCGTGATGGACAGTTGCATGGAAGCTGAGACTCTCACTGACAG 843
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3464 TCTGATTTCAGTTGCAGCCGTGATGGACAGTTGCATGGAAGCTGAGACTCTCACTGACAG 3523

Query: 842  TGAAACCCTCAAA 830
            |||||||||||||
Sbjct: 3524 TGAAACCCTCAAA 3536

Score = 537 (148.4 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 109/111 (98%), Positives = 109/111 (98%), Strand = Minus / Plus



Query: 1227 ACTGACCTGAGTAAGTCACTGGGGTTCAGAGCTGAGAGGTACTCCATGGTGGACCGGAGA 1168
            | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2590 AGTCACCTGAGTAAGTCACTGGGGTTCAGAGCTGAGAGGTACTCCATGGTGGACCGGAGA 2649

Query: 1167 GTTCCTTCCCTGGAACTTCTGGGCTGGGTGGTTCTCTCCTGTGCTGGGGCT 1117
            |||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2650 GTTCCTTCCCTGGAACTTCTGGGCTGGGTGGTTCTCTCCTGTGCTGGGGCT 2700

Score = 394 (108.9 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 82/86 (95%), Positives = 82/86 (95%), Strand = Minus / Plus



Query: 839  AACCCTCAAAATGAACACAATCCCTGCTTTCCTGCCAAGGATCCTTGTAGGGTCCCCCAG 780
            || ||   ||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3526 AAACCCTCAAATGAACACAATCCCTGCTTTCCTGCCAAGGATCCTTGTAGGGTCCCCCAG 3585



Query: 779 CTTCCCCACTTTTTTTCTGTGTCCTG 754 

I I I I I M I 1 I I I I II I I I 1 1 I I I | I I 
Sbjct: 3586 CTTCCCCACTTTTTTTCTGTGTCCTG 3611 

Score = 370 (102.3 bits), Expect = 0.0, Sum P(13) = 0.0
Identities = 74/74 (100%), Positives = 74/74 (100%), Strand = Minus / Plus



Query: 493  CCGGCAGGCCTGGCCGCTGGGCCAGGCGGGCTGCAAGGCGGTGTACTACGTGTGCGCGCT 434
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 5691 CCGGCAGGCCTGGCCGCTGGGCCAGGCGGGCTGCAAGGCGGTGTACTACGTGTGCGCGCT 5750



Query: 433  CAGCATGTACGCCA 420
            ||||||||||||||
Sbjct: 5751 CAGCATGTACGCCA 5764
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SUBSTITUTE SHEET (RULE 26) 






CIDEB2_BLTR2 CIDEB2 1511 BLTR2 6530 OL: 1410 OF1: OF2: 1



Score = 2736 (756.0 bits), Expect = 0.0, Sum P(5) = 0.0
Identities = 548/549 (99%), Positives = 548/549 (99%), Strand = Minus / Plus



Query: 1384 TTTTGTTAGTTTGAGGGGAAGGGTATGAAGACAGATCTCAAGGTAAAGTCAGAGAGGGCT 1325
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1    TTTTGTTAGTTTGAGGGGAAGGGTATGAAGACAGATCTCAAGGTAAAGTCAGAGAGGGCT 60

Query: 1324 GTCATCAGTATGCTGGGGAGTTTAGGGACAGGAGGCATTGGTAGGGGATTAGATGTAGCA 1265
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61   GTCATCAGTATGCTGGGGAGTTTAGGGACAGGAGGCATTGGTAGGGGATTAGATGTAGCA 120

Query: 1264 GCAGTCAGGCTGGGATCAAGATGCCTGGGGGACATCTTGATCTTGGCCTTTCAGGGCAAG 1205
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 121  GCAGTCAGGCTGGGATCAAGATGCCTGGGGGACATCTTGATCTTGGCCTTTCAGGGCAAG 180

Query: 1204 TGGGAGGCCAGAAAGGTGGCTAGGAAAGAACAGCATTCTTCAGGTAAGGGTATAGACTTG 1145
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 181  TGGGAGGCCAGAAAGGTGGCTAGGAAAGAACAGCATTCTTCAGGTAAGGGTATAGACTTG 240

Query: 1144 GGATGTGAGGCGTTATGCTGAAAGGTTCTGTCACGAGGGGATCAGAGGACAGTGGGGAAA 1085
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 241  GGATGTGAGGCGTTATGCTGAAAGGTTCTGTCACGAGGGGATCAGAGGACAGTGGGGAAA 300

Query: 1084 TTGGGTGGGTTATCTAGCCTGTACTGTCTGCAGGTCCTGAAATTTGATGCTGTCATAGTC 1025
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 301  TTGGGTGGGTTATCTAGCCTGTACTGTCTGCAGGTCCTGAAATTTGATGCTGTCATAGTC 360

Query: 1024 TTTGCAGTGGGTCGGTTGGAATGATTCTGGGGGCAGAAGCTCAGAGCCCCTTAGTAGGAA 965
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 361  TTTGCAGTGGGTCGGTTGGAATGATTCTGGGGGCAGAAGCTCAGAGCCCCTTAGTAGGAA 420

Query: 964 TGGAGGCGGCCCTTCTGCTGCCACTGCTCAGCCCCCTCCACTGCATGACGAAGGGTGGAG 905
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 421 TGGAGGCGGCCCTTCTGCTGCCACTGCTCAGCCCCCTCCACTGCATGACGAAGGGTGGAG 480

Query: 904 GAAATTCCCAGCAACATATGGCCCAGGCCTTGCAGCAGTGTGGAGGTCCAACGAAGGAGC 845
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 481 GAAATTCCCAGCAACATATGGCCCAGGCCTTGCAGCAGTGTGGAGGTCCAACGAAGGAGC 540



Fig. 5c 




Query: 844 TCCCTGAGT 836
           ||||||| |
Sbjct: 541 TCCCTGAAT 549

Score = 1787 (493.8 bits), Expect = 0.0, Sum P(5) = 0.0
Identities = 359/361 (99%), Positives = 359/361 (99%), Strand = Minus / Plus



Query: 361  ACTGACCTGAGTAAGTCACTGGGGTTCAGAGCTGAGAGGTACTCCATGGTGGACCGGAGA 302
            | | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2590 AGTCACCTGAGTAAGTCACTGGGGTTCAGAGCTGAGAGGTACTCCATGGTGGACCGGAGA 2649

Query: 301  GTTCCTTCCCTGGAACTTCTGGGCTGGGTGGTTCTCTCCTGTGCTGGGGCTTTAGTGGTG 242
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2650 GTTCCTTCCCTGGAACTTCTGGGCTGGGTGGTTCTCTCCTGTGCTGGGGCTTTAGTGGTG 2709

Query: 241  TTTTCTGTTACAAACCTGGGATCTCAGCCCAGGACAAGGTGGGAATGAGTCAAGCCTGGA 182
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2710 TTTTCTGTTACAAACCTGGGATCTCAGCCCAGGACAAGGTGGGAATGAGTCAAGCCTGGA 2769

Query: 181  CTCTGGCCCCCCTGCCTGGCCAGTAAGAAGGGCAAAGTCCAAGGGGAGGGATGAGGGAGG 122
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2770 CTCTGGCCCCCCTGCCTGGCCAGTAAGAAGGGCAAAGTCCAAGGGGAGGGATGAGGGAGG 2829

Query: 121  GGCCAGATGGGGTCCTGGAGGAAGAATTGCCTGGCAAAAGCCATTGGAGCTTGTATGTGT 62
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2830 GGCCAGATGGGGTCCTGGAGGAAGAATTGCCTGGCAAAAGCCATTGGAGCTTGTATGTGT 2889

Query: 61   GTCTTTGGTGATGACATGTGTTGTGAGGGTAGATGGGAACCATGTAAAAGGATGAAATGT 2
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2890 GTCTTTGGTGATGACATGTGTTGTGAGGGTAGATGGGAACCATGTAAAAGGATGAAATGT 2949

Query: 1    G 1
            |
Sbjct: 2950 G 2950



Score = 965 (266.6 bits), Expect = 0.0, Sum P(5) = 0.0
Identities = 193/193 (100%), Positives = 193/193 (100%), Strand = Minus / Plus



Query: 842 CCTGAGTACTTTCTTTGGGCCAAGTCCTTGAAAGTCACAACTCATAGAGTAGAGCCCGTA 783
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 724 CCTGAGTACTTTCTTTGGGCCAAGTCCTTGAAAGTCACAACTCATAGAGTAGAGCCCGTA 783

Query: 782 GAATGTGGCTTTGACATTCAGGCTGCCAAAGAGGTCTCGAGGGTTTTGCTTGTACACGTC 723

Fig. 5c continued 
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BNSDOCID: <WO. . 






           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 784 GAATGTGGCTTTGACATTCAGGCTGCCAAAGAGGTCTCGAGGGTTTTGCTTGTACACGTC 843

Query: 722 AAAGGTGAATCGGGCGATGTCCTTGCTGTGCTTGGGCCTCTCCCGTCCCAGGCCATATGA 663
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 844 AAAGGTGAATCGGGCGATGTCCTTGCTGTGCTTGGGCCTCTCCCGTCCCAGGCCATATGA 903

Query: 662 CAGCACTCCACTC 650
           |||||||||||||
Sbjct: 904 CAGCACTCCACTC 916

Score = 757 (203.2 bits), Expect = 0.0, Sum P(5) = 0.0
Identities = 161/173 (93%), Positives = 161/173 (93%), Strand = Minus / Plus

Query: 671  GCCATATGACAGCACTCCACTCCTTGTAGGGCTCCAGCTCTGACCAGACTGCAACACCAT 612
            |||  |  | ||  |||    |||||||||||||||||||||||||||||||||||||||
Sbjct: 1131 GCCCCAGTATAGGCCTCTTACCCTTGTAGGGCTCCAGCTCTGACCAGACTGCAACACCAT 1190

Query: 611  CAGGCACGTGTCATCCTCCAGCAGCTGGAAGAAGTCCTCACTGTCCACTGCAGTTCCATC 552
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1191 CAGGCACGTGTCATCCTCCAGCAGCTGGAAGAAGTCCTCACTGTCCACTGCAGTTCCATC 1250

Query: 551  CTCCTCTAGCACCAGGGTTAGCACTCCATTCAGCAGTAGGGTCTCCAATGCTT 499
            ||||||||||||||||||||||||||||||||||||||||||||||||||| |
Sbjct: 1251 CTCCTCTAGCACCAGGGTTAGCACTCCATTCAGCAGTAGGGTCTCCAATGCCT 1303

Score = 746 (206.1 bits), Expect = 0.0, Sum P(5) = 0.0
Identities = 150/151 (99%), Positives = 150/151 (99%), Strand = Minus / Plus

Query: 503  TGCTTTGGCTAGCAGCTCCTGGCGGGTGGCAGCTGTCAGGCCTTTCCGGATGGTCCGCTT 444
            | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2146 TACTTTGGCTAGCAGCTCCTGGCGGGTGGCAGCTGTCAGGCCTTTCCGGATGGTCCGCTT 2205

Query: 443  GTGATCACAGACACGGAAAGGTCGCTGGGGTGGTGGAGCTGAGGTCCAGACCCTCCGTCC 384
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2206 GTGATCACAGACACGGAAAGGTCGCTGGGGTGGTGGAGCTGAGGTCCAGACCCTCCGTCC 2265

Query: 383  AAACTCCGAGCTTATATTAGATACTGACCTG 353
            |||||||||||||||||||||||||||||||
Sbjct: 2266 AAACTCCGAGCTTATATTAGATACTGACCTG 2296
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BNSDOCID. <WO. .03046220A1. 1. > 






APAF1_EB1 APAF1 7042 EB1a 1752 OL: 141 OF1: 6889 OF2: 1612

Score = 705 (194.8 bits), Expect = 3.9e-52, P = 3.9e-52
Identities = 141/141 (100%), Positives = 141/141 (100%), Strand = Minus / Plus



Query: 7029 TGTTTTTCAAAACAATTTTGTGAATTTTATTTTTACAAAAATTTTTTAAATTCATATTTT 6970
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1612 TGTTTTTCAAAACAATTTTGTGAATTTTATTTTTACAAAAATTTTTTAAATTCATATTTT 1671

Query: 6969 AAAATGTATACCAAGGCAAAAAAATCATATAAGCTATATCATAAATACAAGAGTTTCAAA 6910
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1672 AAAATGTATACCAAGGCAAAAAAATCATATAAGCTATATCATAAATACAAGAGTTTCAAA 1731



Query: 6909 ACATACAAGAGACATATAATG 6889
            |||||||||||||||||||||
Sbjct: 1732 ACATACAAGAGACATATAATG 1752



Fig. 5d 






Hum_AChR_MINK2 AChR 2457 MINK2 4863 OL: 236 OF1: 2175 OF2: 4583

Score = 218 bits (110), Expect = 2e-59
Identities = 110/110 (100%)
Strand = Plus / Minus

Query: 2254 aagggttacttgctgctcacactatatacagatgcaagcaaggggcgtggagagtgaggg 
2313 

| I I I M M I I 1 I M II U I I U I I I I I I I M I M I 1 11 U 1 I ! I M M M I M 11 I I I I I 
Sbjct: 4787 aagggttacttgctgctcacactatatacagatgcaagcaaggggcgtggagagtgaggg 4728

Query: 2314 ctccctgctccctccctccaccggggaagggcatgggctagaagaggaga 2363
            ||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 4727 ctccctgctccctccctccaccggggaagggcatgggctagaagaggaga 4678

Score = 133 bits (67), Expect = 9e-34
Identities = 74/75 (98%), Gaps = 1/75 (1%)
Strand = Plus / Minus

Query: 2384 aatgttttggctg-cggggtcccccctccattccctggagtttgggggaaggggaatcat 
2442 

II I Mill MM! M I 1 M I I i I I I I I 1 I M I I I 1 I I I M M I M 1 1 1 I U I I I I I 1 M 
Sbjct : 4657 aatgttttggctggcggggtcccccctccattccctggagtttgggggaaggggaatcat 

4598 

Query: 2443* taaagtgctttcaga 2457 

I I I II II I I I M I I I 
Sbjct: 4597 taaagtgctttcaga 4583 

Score = 103 bits (52), Expect - 8e-25 
Identities = 52/52 (100%) 
Strand = Plus / Minus 

Query: 2175 agctggttgaattgtctttattaacaaacaggatatccaaggccactacatt 2226 

I I I I I I I t I M 1 I M I M I I t I I I I I I M I I M I 1 t 1 1 M 1 1 1 1 I 1 M 1 1 1 1 
Sbjct: 4863 agctggttgaattgtctttattaacaaacaggatatccaaggccactacatt 4»iZ 



Mus_AChR Mus_AChR 1590 Anti_Mus_AChR 2227 OL: 672 OF1: 934 OF2: 506

Score = 1221 (337.4 bits), Expect = 3.7e-254, Sum P(4) = 3.7e-254
Identities = 245/246 (99%), Positives = 245/246 (99%), Strand = Minus / Plus

Query: 1587 AGTTCACAAACCAGATTTATTGTCAGCGGCCTGTTTTCAAAATCTCTTTCTTGGGGGGTG 1528
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 506  AGTTCACAAACCAGATTTATTGTCAGCGGCCTGTTTTCAAAATCTCTTTCTTGGGGGGTG 565

Query: 1527 GGGGAGAGGTGGGTGCCAGTGCAGGCTCATGGTTGGATGCACGGTGGGTAAGGGAGATCA 1468
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 566  GGGGAGAGGTGGGTGCCAGTGCAGGCTCATGGTTGGATGCACGGTGGGTAAGGGAGATCA 625

Query: 1467 GGAACTTGGTTGAAGTAACCCCCAAGGAAGATGAGAGTAGAACCAACGCTGAAGAGCACC 1408
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 626  GGAACTTGGTTGAAGTAACCCCCAAGGAAGATGAGAGTAGAACCAACGCTGAAGAGCACC 685

Query: 1407 AAAGCTGCCCAAAAACAGACATTGTCCAGGGCCTTCCCCATACGCACCCAGTCGGACAGT 1348
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 686  AAAGCTGCCCAAAAACAGACATTGTCCAGGGCCTTCCCCATACGCACCCAGTCGGACAGT 745

Query: 1347 TCCTCT 1342
            |||| |
Sbjct: 746  TCCTGT 751

Score = 954 (263.6 bits), Expect = 3.7e-254, Sum P(4) = 3.7e-254
Identities = 198/207 (95%), Positives = 198/207 (95%), Strand = Minus / Plus

Query: 1249 GCAGAGGGCTGCGGTCCAAGTTCCGTGCCGATGCCTCTGACCCTCAAACACGAGTTCGCT 1190
            |||| |     |||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1007 GCAGTGCTTACCGGTCCAAGTTCCGTGCCGATGCCTCTGACCCTCAAACACGAGTTCGCT 1066

Query: 1189 CCGCGGCTTTTTCAAGATGAGCTCCTCCGCTCTGAGCAGAATGCCCACAGACGAGGCACG 1130
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1067 CCGCGGCTTTTTCAAGATGAGCTCCTCCGCTCTGAGCACAATGCCCACAGACGAGGCACG 1126

Query: 1129 CCTCGCTGGTGAGGCAGTTCGGGGATCCTCTGGGGGTGGGCTCGAGCCCAGGAGACGCGG 1070
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1127 CCTCGCTGGTGAGGCAGTTCGGGGATCCTCTGGGGGTGGGCTCGAGCCCAGGAGACGCGG 1186

Query: 1069 CAGCAGCTCTAATAAAATCTGGCGCAG 1043
            |||||||||||||||||||||  || |
Sbjct: 1187 CAGCAGCTCTAATAAAATCTGCAGCCG 1213

Score = 591 (163.3 bits), Expect = 3.7e-254, Sum P(4) = 3.7e-254
Identities = 119/120 (99%), Positives = 119/120 (99%), Strand = Minus / Plus

Query: 1053 ATCTGGCGCAGCCGAGGGGATGTAGCATGAGTCGTTGGCGTCCTCAAAGATACGTTGAGC 994
            | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1277 ACCTGGCGCAGCCGAGGGGATGTAGCATGAGTCGTTGGCGTCCTCAAAGATACGTTGAGC 1336

Query: 993  ACGATGACGCAATTCATGACAATGAGCGTGGCAACCACCATGACGAATATAAGATACCTG 934
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1337 ACGATGACGCAATTCATGACAATGAGCGTGGCAACCACCATGACGAATATAAGATACCTG 1396

Score = 546 (150.9 bits), Expect = 3.7e-254, Sum P(4) = 3.7e-254
Identities = 110/111 (99%), Positives = 110/111 (99%), Strand = Minus / Plus

Query: 1346 CCTCTCCAGTGGCTTCCTGGTCTCTTGTGCTCTCAGCCACAAAGTTCACAGCATCCACAC 1287
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 826  CCTCTCCAGTGGCTTCCTGGTCTCTTGTGCTCTCAGCCACAAAGTTCACAGCATCCACAC 885

Query: 1286 AGCAGCGGATTTCTGGGGCTGCAGCACCCAGGTTCTGGCAGAGGGCTGCGG 1236
            ||||||||||||||||||||||||||||||||||||||||||||||||| |
Sbjct: 886  AGCAGCGGATTTCTGGGGCTGCAGCACCCAGGTTCTGGCAGAGGGCTGCTG 936



Fig. 5f continued 



CyclinE2 CyclinE2 2714 Anti_CyclinE2 6773 OL: 1855 OF1: 865 OF2: 2006

Score = 7885 (2178.8 bits), Expect = 0.0, Sum P(4) = 0.0
Identities = 1577/1577 (100%), Positives = 1577/1577 (100%), Strand = Minus / Plus

Query: 2714 TACAGCTGGCAGCGCAGAGAAGGAAAAAAAAGTTTCTCCAAGCAATGGCAAAACTTTACT 2655
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2006 TACAGCTGGCAGCGCAGAGAAGGAAAAAAAAGTTTCTCCAAGCAATGGCAAAACTTTACT 2065

Query: 2654 TTTAAGCAGTTAAATTTTTTTAACTTTTATTTTTTAAACAATGGGCTAAAAATAAACAGT 2595
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2066 TTTAAGCAGTTAAATTTTTTTAACTTTTATTTTTTAAACAATGGGCTAAAAATAAACAGT 2125

Query: 2594 ATTAAAAGGTTAAGTTTATATAATACATATGTACACAATTAGTGGTGTTTTCTTTTCAGA 2535
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2126 ATTAAAAGGTTAAGTTTATATAATACATATGTACACAATTAGTGGTGTTTTCTTTTCAGA 2185

Query: 2534 CAAAATACTGAAACAAATATTAGTTTAAAAACAAACTATACAGAAGACTTCATACCGTAA 2475
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2186 CAAAATACTGAAACAAATATTAGTTTAAAAACAAACTATACAGAAGACTTCATACCGTAA 2245

Query: 2474 CAATAAATGTATAGTTTCTTCAAAGGGAGAAGAGATTCACATATCTGATAACAAAATAAA 2415
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2246 CAATAAATGTATAGTTTCTTCAAAGGGAGAAGAGATTCACATATCTGATAACAAAATAAA 2305

Query: 2414 CTAGCAATCTAGTTTTCTAATCTACTTTATGAGGCTGGATTTTTTTTTTAGAAAAGCTAA 2355
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2306 CTAGCAATCTAGTTTTCTAATCTACTTTATGAGGCTGGATTTTTTTTTTAGAAAAGCTAA 2365

Query: 2354 TTTAAAATATTTAGAAATAGCTAGCCTATGTACAGCAAGTTTTCATGTCTTTTTTTAATA 2295
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2366 TTTAAAATATTTAGAAATAGCTAGCCTATGTACAGCAAGTTTTCATGTCTTTTTTTAATA 2425

Query: 2294 AATAGATTTCTAGGAGTCAGTATATATTTAATACTCTTCTTCCTTAAGAAAATAGAAGTT 2235
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2426 AATAGATTTCTAGGAGTCAGTATATATTTAATACTCTTCTTCCTTAAGAAAATAGAAGTT 2485

Query: 2234 TAGGTCAAGTGTTAAGCTTTATCACTTTGACACTGTCCTTATCTCACAATGGAGGAATTT 2175
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2486 TAGGTCAAGTGTTAAGCTTTATCACTTTGACACTGTCCTTATCTCACAATGGAGGAATTT 2545

Query: 2174 AGAAAGGACCTTAACAGTTTCACAAACATAAATAAAGCCTTAGTCACACTAAATTAAAAA 2115
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2546 AGAAAGGACCTTAACAGTTTCACAAACATAAATAAAGCCTTAGTCACACTAAATTAAAAA 2605

Query: 2114 AAAAAATTCCTTAGGGATATCTTAGAGTAGTAAAGTGACTTCCTCATATAAATAGTTTGA 2055
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2606 AAAAAATTCCTTAGGGATATCTTAGAGTAGTAAAGTGACTTCCTCATATAAATAGTTTGA 2665

Query: 2054 AAGGGTACTTAAGTTTTTCACCCAAATTGTGATATACAAAAAGGTTATTACCAAGCAACC 1995
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2666 AAGGGTACTTAAGTTTTTCACCCAAATTGTGATATACAAAAAGGTTATTACCAAGCAACC 2725

Query: 1994 TACATGTCAAGAAAGCCCCAGTTAGGAAGGAGCCACAGCATTTATCTTGTTTATAATTTC 1935
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2726 TACATGTCAAGAAAGCCCCAGTTAGGAAGGAGCCACAGCATTTATCTTGTTTATAATTTC 2785

Query: 1934 TTTGGTACTCCCACTGTTTAGAGCACAGGTTGAACACCATGTTCATCTAAGCCTTATTAG 1875
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2786 TTTGGTACTCCCACTGTTTAGAGCACAGGTTGAACACCATGTTCATCTAAGCCTTATTAG 2845

Query: 1874 TTAAAAAATGTGTTATGGCAAGGCAAATAAACTAGTTTAAAAAACATTAAATTTCACCAT 1815
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2846 TTAAAAAATGTGTTATGGCAAGGCAAATAAACTAGTTTAAAAAACATTAAATTTCACCAT 2905

Query: 1814 TTGTAGAAATTCAAGTTTTATAATAGCTTGCTATAGCAGCTATAGATAAATTAGTCACCT 1755
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2906 TTGTAGAAATTCAAGTTTTATAATAGCTTGCTATAGCAGCTATAGATAAATTAGTCACCT 2965

Query: 1754 TATTACAAAACTAAACCTTTGTAAACAAGTTTAAATTTAATTTTCAAGAACCAAATTGCA 1695
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 2966 TATTACAAAACTAAACCTTTGTAAACAAGTTTAAATTTAATTTTCAAGAACCAAATTGCA 3025

Query: 1694 CTAGTCAAGAGTGTAGGAATTTTGAGAATCTAACAACTAGATTCAAAGTACTGTATCACT 1635
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3026 CTAGTCAAGAGTGTAGGAATTTTGAGAATCTAACAACTAGATTCAAAGTACTGTATCACT 3085

Query: 1634 TAGTATACCCTTTAAGGTAGCACTTATCCAGTCCAAAACTCCAGTGACAAAATTCCTAGT 1575
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3086 TAGTATACCCTTTAAGGTAGCACTTATCCAGTCCAAAACTCCAGTGACAAAATTCCTAGT 3145

Query: 1574 TTATCAAGATAAACACAGTAACACTGGATTAAAGGAAAAACATTGCTATGGTATAGACTG 1515
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3146 TTATCAAGATAAACACAGTAACACTGGATTAAAGGAAAAACATTGCTATGGTATAGACTG 3205

Query: 1514 TGGTTGGCTTCTATCCAGTAACCTTGGGAATGAAGACATCTTTGTAAACAAGTCCTGCTG 1455
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3206 TGGTTGGCTTCTATCCAGTAACCTTGGGAATGAAGACATCTTTGTAAACAAGTCCTGCTG 3265

Query: 1454 TTTCTTTAACAGCTAACATAGGAAATAATTAAATGTATTCTTTAGTGCCAATTGTAAGTT 1395
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3266 TTTCTTTAACAGCTAACATAGGAAATAATTAAATGTATTCTTTAGTGCCAATTGTAAGTT 3325

Query: 1394 TTAAAATCAGAATGGCAGTGTAACTTGTGAATTGGCTAGGGCAATCAATCACAGCACTAC 1335
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3326 TTAAAATCAGAATGGCAGTGTAACTTGTGAATTGGCTAGGGCAATCAATCACAGCACTAC 3385

Query: 1334 TTTCTGTAAAACTTTAGTAGTTCAGTGATACCAGTTCTACCCAATCTTGGTGAATTCCAA 1275
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3386 TTTCTGTAAAACTTTAGTAGTTCAGTGATACCAGTTCTACCCAATCTTGGTGAATTCCAA 3445

Query: 1274 CTTGTTTGCTTAGTTATCTTCTTTAGTGTTTTCCTGGTGGTTTTTCAGTGCTCTTCGGTG 1215
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3446 CTTGTTTGCTTAGTTATCTTCTTTAGTGTTTTCCTGGTGGTTTTTCAGTGCTCTTCGGTG 3505

Query: 1214 GTGTCATAATGCCTCCATTGCACACTGGTGACAACTGTCCCCCTTTTCTGAAGGTGTTTA 1155
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3506 GTGTCATAATGCCTCCATTGCACACTGGTGACAACTGTCCCCCTTTTCTGAAGGTGTTTA 3565

Query: 1154 NGTAATTTACTTCCTCC 1138
            |||||||||||||||||
Sbjct: 3566 NGTAATTTACTTCCTCC 3582

Score = 805 (222.4 bits), Expect = 0.0, Sum P(4) = 0.0
Identities = 161/161 (100%), Positives = 161/161 (100%), Strand = Minus / Plus

Query: 1139 CCAGCATAGCCAAATAGTTTGTATGTGTCTGGATATTATGTCTGTCTTCCATAGGAATCT 1080
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 3967 CCAGCATAGCCAAATAGTTTGTATGTGTCTGGATATTATGTCTGTCTTCCATAGGAATCT 4026

Query: 1079 TCTTAAAAGTCTTCAGCTTCACTGCACTAGTACTTTTTACTACATTGACAAAAGCTACCA 1020
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 4027 TCTTAAAAGTCTTCAGCTTCACTGGACTAGTACTTTTTACTACATTGACAAAAGGTACCA 4086

Query: 1019 TCCAATCTACACATTCTGAAATACTGTCCCACTCCAAACCT 979
            |||||||||||||||||||||||||||||||||||||||||
Sbjct: 4087 TCCAATCTACACATTCTGAAATACTGTCCCACTCCAAACCT 4127

Score = 581 (160.5 bits), Expect = 0.0, Sum P(4) = 0.0
Identities = 117/118 (99%), Positives = 117/118 (99%), Strand = Minus / Plus

Query: 982  ACCTGAGGCTTTCTTAACCACTTCAATGGAGGTAAAATGGCACAAGGCAGCAGCAGTCAG 923
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 4615 ACCTGAGGCTTTCTTAACCACTTCAATGGAGGTAAAATGGCACAAGGCAGCAGCAGTCAG 4674

Query: 922  TATTCTGTACTGGAACTCTAATGAATCAATGGCTAGAATACACAGATCTTAAAGCTGA 865
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Sbjct: 4675 TATTCTGTACTGGAACTCTAATGAATCAATGGCTAGAATACACAGATCTTAAAGCTAA 4732

Fig. 5g continued

SEQUENCE LISTING

<110> Levanon Erez, et al.

<120> METHODS AND SYSTEMS FOR IDENTIFYING NATURALLY OCCURRING ANTISENSE
TRANSCRIPTS AND METHODS AND KITS UTILIZING SAME

<130> 02/25320

<150> US 09/718,407
<151> 2000-11-24

<150> US 09/732,938
<151> 2000-12-11

<150> US 09/785,439
<151> 2001-02-20

<150> US 09/907,923
<151> 2001-07-18

<150> US 009/993,398
<151> 2001-11-06

<150> US 10/201,605
<151> 2002-7-24

<160> 44

<170> PatentIn version 3.1

<210> 1 

<211> 190 

<212> DNA 

<213> Homo sapiens 

<400> 1 

ggacccagga tatgagcgga aaacactttc tctacttaga tacaactttt tcctgtgcgc 60 
atgcctgtaa tcccagctac tcaggaggct gaggcaggag aatcccttga acccaggagg 120 




cagaggttgc ggtgagccaa gatctcacca ttgcactcca gcctgggcaa taagaacaaa 180 

actccgtctc 190 

<210> 2 

<211> 783 

<212> DNA 

<213> Homo sapiens 

<400> 2 

gaaaaagttg tatctaagta gagaaagtgt tttccgctca tatcctgggt ccacatcgaa 60 

gaattcagtc cttgtggatg aactgtaaac agcacccttc ctctaagatg ccgaagatca 120 

tagtttgtgg tttttttctt tcaggcggtg gaagcagggc agagccgaag cagcccgctc 180 

ctcaagaggc cggtgcggac ccaggcggtg ctggaccagt cagatgtgta cacccatgtc 240 

ctgtcagcct tcgtggaaaa gaaggtgggc cgcagctttc cgcctcttct ggactgagaa 300 

tgctcaaaac aaggaagttg ctgaaaacga ggagacttca tgtgattaga gtcacttgaa 360 

gtgattagaa tcactggagt ttccttgggt gaggccctag agctggtagt ttggcttcta 420

atgctgaggc ctaaagcata attgttgacg ggtggttctg gagcgatttg tgcaaaacca 480

gtgaaagatg aacactgggc cattttaaga tggaaacaag gtgggggttg gatagagagt 540

tatatgcagc ctcttttgca cctcgttggt atttgtaaga ccacattttt ttctccctag 600 

gagatgcctc ataaatttgt gatagccgtg ctgatggaat acattcgttc tcttaaccag 660 

tttcagattg cagtacagct atgtaactga gtaagacagg gagaaatatt aatccgtgag 720 

tggctcccag taagaccatg gccaaataca tcctgaagta gaatatctgg aaaatttgag 780 

att 783 

<210> 3 

<211> 1649 

<212> DNA 

<213> Homo sapiens 

<400> 3 

gaaaaagttg tatctaagta gagaaagtgt tttccgctca tatcctgggt ccacatcgaa 60 

gaattcagtc cttgtggatg aactgtaaac agcacccttc ctctaagatg ccgaagatca 120 

tagtttgtgg tttttttctt tcaggcggtg gaagcagggc agagccgaag cagcccgctc 180 

ctcaagaggc cggtgcggac ccaggcggtg ctggaccagt cagatgtgta cacccatgtc 240

ctgtcagcct tcgtggaaaa gaaggtgggc cgcagctttc cgcctcttct ggactgagaa 300 

tgctcaaaac aaggaagttg ctgaaaacga ggagacttca tgtgattaga gtcacttgaa 360 

gtgattagaa tcactggagt ttccttgggt gaggccctag agctggtagt ttggcttcta 420 

atgctgaggc ctaaagcata attgttgacg ggtggttctg gagcgatttg tgcaaaacca 480

gtgaaagatg aacactgggc cattttaaga tggaaacaag gtgggggttg gatagagagt 540

tatatgcagc ctcttttgca cctcgttggt atttgtaaga ccacattttt ttctccctag 600

gagatgcctc ataaatttgt gatagccgtg ctgatggaat acattcgttc tcttaaccag 660

tttcagattg cagtacagcc ttcaaatcat ctgggcccaa gttaaaacag aaggaattta 720

aaaaaaaaac acagtcactg tcttagaaga tgactcatat gctaagacag gtctgcctcc 780

ctgactcaga atgctgagtg actcctgaca ttattagttg gaatgggaag tgtaaggtca 840

agttggggtc tttacctgca tgacgaaacc acttcttgta atgacagact tttactgtgt 900

tggttagaat agccagtcct tggggagcct ctagtctgtt gtagctgaat gatttggaag 960

tgttctttca ctttttactt ttgtcctcag cattacctac atgaacttgt tatcaaaacc 1020

cttgtccagc acaacctctt ttatatgctg catcagttcc tgcagtacca cgtcctcagc 1080

gactccaaac ctttggcttg tctgctgtta tccctagaga gtttctatcc tcctgctcat 1140

cagctatctc tggacatgct gaaggtaact ctgatgtgtg aggttttaga ctatggaaac 1200

taactctgtt cctgttgttt gcactgacct ggacttctct cccttactgc tagcgacttt 1260

caacagcaaa tgatgaaata gtagaagttc tcctttccaa acaccaagtg ttagctgcct 1320

taaggtttat ccggggcatt ggtggccatg acaacatttc tgcacgaaaa tttttagatg 1380

ctgcaaagca gactgaagac aacatgcttt tctatacaat attccgcttt tttgaacagc 1440

gaaaccagcg tttgcgaggg agccccaatt tcacaccagg ggaacactgt gaagaacatg 1500

ttgctttttt caaacagatt tttggagacc aagctctaat gaggcctaca acattctgaa 1560

atcacttgct gtttttttat ataaaaatgt gtacaaagtt aatttattgc attaataaag 1620

ctctttaaac tataaaatgt taaaaagtg 1649



<210> 4 

<211> 1861 

<212> DNA 

<213> Homo sapiens 



<400> 4 

gaaaaagttg tatctaagta gagaaagtgt tttccgctca tatcctgggt ccacatcgaa 60

gaattcagtc cttgtggatg aactgtaaac agcacccttc ctctaagatg ccgaagatca 120

tagtttgtgg tttttttctt tcaggcggtg gaagcagggc agagccgaag cagcccgctc 180

ctcaagaggc cggtgcggac ccaggcggtg ctggaccagt cagatgtgta cacccatgtc 240

ctgtcagcct tcgtggaaaa gaaggtgggc cgcagctttc cgcctcttct ggactgagaa 300

tgctcaaaac aaggaagttg ctgaaaacga ggagacttca tgtgattaga gtcacttgaa 360

gtgattagaa tcactggagt ttccttgggt gaggccctag agctggtagt ttggcttcta 420

atgctgaggc ctaaagcata attgttgacg ggtggttctg gagcgatttg tgcaaaacca 480

gtgaaagatg aacactgggc cattttaaga tggaaacaag gtgggggttg gatagagagt 540

tatatgcagc ctcttttgca cctcgttggt atttgtaaga ccacattttt ttctccctag 600

gagatgcctc ataaatttgt gatagccgtg ctgatggaat acattcgttc tcttaaccag 660

tttcagattg cagtacagcc ttcaaatcat ctgggcccaa gttaaaacag aaggaattta 720

aaaaaaaaac acagtcactg tcttagaaga tgactcatat gctaagacag gtctgcctcc 780

ctgactcaga atgctgagtg actcctgaca ttattagttg gaatgggaag tgtaaggtca 840

agttggggtc tttacctgca tgacgaaacc acttcttgta atgacagact tttactgtgt 900

tggttagaat agccagtcct tggggagcct ctagtctgtt gtagctgaat gatttggaag 960

tgttctttca ctttttactt ttgtcctcag cattacctac atgaacttgt tatcaaaacc 1020

cttgtccagc acaacctctt ttatatgctg catcagttcc tgcagtacca cgtcctcagc 1080

gactccaaac ctttggcttg tctgctgtta tccctagaga gtttctatcc tcctgctcat 1140

cagctatctc tggacatgct gaaggtaact ctgatgtgtg aggttttaga ctatggaaac 1200

taactctgtt cctgttgttt gcactgacct ggacttctct cccttactgc tagcgacttt 1260

caacagcaaa tgatgaaata gtagaagttc tcctttccaa acaccaagtg ttagctgcct 1320

taaggtttat ccggggcatt ggtggccatg acaacatttc tgcacgaaaa tttttagatg 1380

ctgcaaagca gactgaagac aacatgcttt tctatacaat attccgcttt tttgaacagc 1440

gaaaccagcg tttgcgaggg agccccaatt tcacaccagg tgagaatgca atgaaaagac 1500

ttggggtaac catagcctca aagagtagca gagggcactg gcagctggtg ggcgaggacc 1560

ctgggttagc atttttgtaa acaacacaat ttgataacag cccacctagc ccttggccca 1620

ttatttgtag tagagtgaat tcagtatact gacagaatct ggattatgct ctggaactca 1680

ccgaggaggt gtgttttgag tcaagacaca tttaggaccc agatcaggca cagcccatct 1740

cttatagcag atcttggaat atctcttaaa gccaggaata agacggcaaa tggtggctaa 1800

gggttttaaa gggtctgggg cttattaagg tttcagtttt atgaagtata cattggttga 1860

t 1861



<210> 5 

<211> 214 

<212> DNA 

<213> Homo sapiens 



<400> 5 

gtaagggaac tttggcgact tagtgcgatc actgggagaa ttgtagagtc cactggagag 60 

aaagaaaaat ggtcaaaaag agcccagaga gttcctgggg gaaaacacac cgcagcccag 120 

acctattcat aactgcacag ctggtacttc cagaggcaca tgcaccaggg gcacgtggtt 180 

ctctttgctg acaagattta ttaaaagaaa agag 214 

<210> 6 
<211> 1934 
<212> DNA

<213> Homo sapiens 

<400> 6 

aagtcaacga aaggttccgt tgtccttgac cacgtattcc atcacgtaaa ccttgtggag 60

atagattatt ttgggctacg ttactgtgac agaagccatc agacgtattg gctggatcct 120

gcaaaaaccc ttgctgaaca caaagaactg atcaacactg gacctccata tactttgtat 180

tttggtatta aattctatgc tgaagatcca tgtaaactta aagaagaaat aaccagatat 240

cagtttttct tgcaggtgaa gcaagatgtc cttcagggcc gtctgccctg tcccgtcaac 300

actgctgctc agctgggagc gtatgccatc cagtcggagc ttggagatta tgacccatat 360

aaacatactg caggatatgt atctgagtac cggtttgttc ctgatcagaa ggaagaactt 420

gaagaagcca tagaaaggat tcataaaact ctaatgggtc agattccttc tgaggctgag 480

ctgaattact tgaggactgc caaatccctg gagatgtatg gcgttgacct ccatcccgtc 540

tatggagaaa acaagtctga gtatttctta ggattaactc cggttggtgt tgttgtgtac 600

aagaataaaa agcaagtggg gaagtatttc tggcctcgga ttacaaaggt tcacttcaag 660

gagactcaat ttgaactcag agtactggga aaagattgta acgaaacctc attctttttt 720

gaagctcgga gtaaaactgc ttgcaagcac ctctggaagt gcagtgtgga acatcataca 780

ttttttagaa tgccagaaaa tgaatccaat tcactgtcaa gaaaactcag caagtttgga 840

tccatacgtt ataagcaccg ctacagtggc aggacagctt tgcaaatgag ccgagatctt 900

tctattcagc ttccccggcc tgatcagaat gtgacaagaa gtcgaagcaa gacttaccct 960

aagcgaatag cacaaacaca gccagctgaa tcaaacacca tcagtaggat aactgcaaac 1020

atggaaaatg gagaaaatga aggaacaatt aaaattattg caccttcacc agtaaaaagc 1080

tttaagaaag caaagaatga aaatagccct gatacccaaa gaagcaaatc tcatgcaccg 1140

tgggaagaaa atggccccca gagtggactc tacaattctc ccagtgatcg cactaagtcg 1200

ccaaagttcc cttacacgcg tcgccgaaac ccctcctgtg gaagtgacaa tgattctgta 1260

cagcctgtga ggaggaggaa agcccataac agtggtgaag attcagatct taagcaaagg 1320

aggaggtcac gttcacgctg taacaccagc agtggtagtg aatcagaaaa ttctaataga 1380

gaacaccgga aaaagagaaa cagaatacgg caggagaatg atatggttga ttcagcgcct 1440

cagtgggaag ctgtattaag gagacaaaag gaaaaaaacc aagccgaccc caacagcagg 1500

cgatccagac acagatctcg ttcgagaagc cccgatatcc aagcaaaaga agagttatgg 1560

aagcacattc aaaaagaact tgtggatcca tccggattgt ccgaagaaca attaaaagag 1620

attccataca ctaaaataga gtgagtgcct ttcagaatct tctcaccaaa gctttattag 1680

tgcttgtgag taatccattc taattcttca attgtgttcc agacagtgct ttaatttgtc 1740

tttacatttt aaccaaaact aggtgacagt agcgaaagag gaagaaaagt gtgcattaaa 1800

gctacttatt ctacactata atcactatca tctcttatta gccacctctt tgtacttggt 1860

aggtacaagg gggcttttcc tgattaatgt cagttttaaa ataaattctt ttctgagatt 1920

ctcactgaaa aaat 1934

<210> 7 

<211> 2353 

<212> DNA 

<213> Homo sapiens 



<400> 7 

aagtcaacga aaggttccgt tgtccttgac cacgtattcc atcacgtaaa ccttgtggag 60

atagattatt ttgggctacg ttactgtgac agaagccatc agacgtattg gctggatcct 120

gcaaaaaccc ttgctgaaca caaagaactg atcaacactg gacctccata tactttgtat 180

tttggtatta aattctatgc tgaagatcca tgtaaactta aagaagaaat aaccagatat 240

cagtttttct tgcaggtgaa gcaagatgtc cttcagggcc gtctgccctg tcccgtcaac 300

actgctgctc agctgggagc gtatgccatc cagtcggagc ttggagatta tgacccatat 360

aaacatactg caggatatgt atctgagtac cggtttgttc ctgatcagaa ggaagaactt 420

gaagaagcca tagaaaggat tcataaaact ctaatgggtc agattccttc tgaggctgag 480

ctgaattact tgaggactgc caaatccctg gagatgtatg gcgttgacct ccatcccgtc 540

tatggagaaa acaagtctga gtatttctta ggattaactc cggttggtgt tgttgtgtac 600

aagaataaaa agcaagtggg gaagtatttc tggcctcgga ttacaaaggt tcacttcaag 660

gagactcaat ttgaactcag agtactggga aaagattgta acgaaacctc attctttttt 720

gaagctcgga gtaaaactgc ttgcaagcac ctctggaagt gcagtgtgga acatcataca 780

ttttttagaa tgccagaaaa tgaatccaat tcactgtcaa gaaaactcag caagtttgga 840

tccatacgtt ataagcaccg ctacagtggc aggacagctt tgcaaatgag ccgagatctt 900

tctattcagc ttccccggcc tgatcagaat gtgacaagaa gtcgaagcaa gacttaccct 960

aagcgaatag cacaaacaca gccagctgaa tcaaacacca tcagtaggat aactgcaaac 1020

atggaaaatg gagaaaatga aggaacaatt aaaattattg caccttcacc agtaaaaagc 1080

tttaagaaag caaagaatga aaatagccct gatacccaaa gaagcaaatc tcatgcaccg 1140

tgggaagaaa atggccccca gagtggactc tacaattctc ccagtgatcg cactaagtcg 1200

ccaaagttcc cttacacgcg tcgccgaaac ccctcctgtg gaagtgacaa tgattctgta 1260

cagcctgtga ggaggaggaa agcccataac agtggtgaag attcagatct taagcaaagg 1320

aggaggtcac gttcacgctg taacaccagc agtggtagtg aatcagaaaa ttctaataga 1380

gaacaccgga aaaagagaaa cagaatacgg caggagaatg atatggttga ttcagcgcct 1440

cagtgggaag ctgtattaag gagacaaaag gaaaaaaacc aagccgaccc caacagcagg 1500

cgatccagac acagatctcg ttcgagaagc cccgatatcc aagcaaaaga agagttatgg 1560

aagcacattc aaaaagaact tgtggatcca tccggattgt ccgaagaaca attaaaagag 1620

attccataca ctaaaataga gacacaaggt gacccaatcc gcatcaggca ttctcattcg 1680

ccacgaagtt accgccagta tcgcaggtcc cagtgttcag atggggagcg atcagttctc 1740

tcggaagtga attcaaaaac agatcttgta ccaccacttc cggtgaccca ttcttcggat 1800

gctcagggtt ctggggatgc tacagttcat cagagaagaa atgggtctaa agatagcctg 1860

atggaagaaa aacctcagac atctacaaac aacctggctg gaaaacacac agcaaaaaca 1920

ataaaaacta tacaagcttc ccgcctcaag acagagactt gatcctgatg aagggtcaag 1980

ggtaggggtg ggaaggttgt gtgcgccact ggtacttttg aaactgtgaa ataggtatct 2040

taattcaaat ctcagacctg caagtatttc ttcagcatga gaaaatacat tatcttttgc 2100

ttcttttttt tttttttttg agatgttatc actctgtcgc ccaggctgga gtgcagcggc 2160

accgtgtcag ctcaccgcag cctccactta ctgggttaag cgattctcct gtctcaggct 2220

accgagcagc tgggattaca ggcgtgcacc acaacacccg gctaattctt tttgtatttt 2280

tagtagagac agggctttgc catgttggag gctggtctcg aactcctgac ctcaagtgat 2340

ccgcctgcct cag 2353



<210> 8 

<211> 2500 

<212> DNA 

<213> Homo sapiens 



<400> 8 

gacatgggct gtttctgcgc tgttccggaa gaattttact gcgaagtttt gctcctggat 60

gaatccaagt taacccttac cacccagcag cagggcatca agaagtcaac gaaaggttcc 120

gttgtccttg accacgtatt ccatcacgta aaccttgtgg agatagatta ttttgggcta 180

cgttactgtg acagaagcca tcagacgtat tggctggatc ctgcaaaaac ccttgctgaa 240

cacaaagaac tgatcaacac tggacctcca tatactttgt attttggtat taaattctat 300

gctgaagatc catgtaaact taaagaagaa ataaccagat atcagttttt cttgcaggtg 360

aagcaagatg tccttcaggg ccgtctgccc tgtcccgtca acactgctgc tcagctggga 420

gcgtatgcca tccagtcgga gcttggagat tatgacccat ataaacatac tgcaggatat 480

gtatctgagt accggtttgt tcctgatcag aaggaagaac ttgaagaagc catagaaagg 540

attcataaaa ctctaatggg tcagattcct tctgaggctg agctgaatta cttgaggact 600

gccaaatccc tggagatgta tggcgttgac ctccatcccg tctatggaga aaacaagtct 660

gagtatttct taggattaac tccggttggt gttgttgtgt acaagaataa aaagcaagtg 720

gggaagtatt tctggcctcg gattacaaag gttcacttca aggagactca atttgaactc 780

agagtactgg gaaaagattg taacgaaacc tcattctttt ttgaagctcg gagtaaaact 840

gcttgcaagc acctctggaa gtgcagtgtg gaacatcata cattttttag aatgccagaa 900

aatgaatcca attcactgtc aagaaaactc agcaagtttg gatccatacg ttataagcac 960

cgctacagtg gcaggacagc tttgcaaatg agccgagatc tttctattca gcttccccgg 1020

cctgatcaga atgtgacaag aagtcgaagc aagacttacc ctaagcgaat agcacaaaca 1080

cagccagctg aatcaaacac catcagtagg ataactgcaa acatggaaaa tggagaaaat 1140

gaaggaacaa ttaaaattat tgcaccttca ccagtaaaaa gctttaagaa agcaaagaat 1200

gaaaatagcc ctgataccca aagaagcaaa tctcatgcac cgtgggaaga aaatggcccc 1260

cagagtggac tctacaattc tcccagtgat cgcactaagt cgccaaagtt cccttacacg 1320

cgtcgccgaa acccctcctg tggaagtgac aatgattctg tacagcctgt gaggaggagg 1380

aaagcccata acagtggtga agattcagat cttaagcaaa ggaggaggtc acgttcacgc 1440

tgtaacacca gcagtggtag tgaatcagaa aattctaata gagaacaccg gaaaaagaga 1500

aacagaatac ggcaggagaa tgatatggtt gattcagcgc ctcagtggga agctgtatta 1560

aggagacaaa aggaaaaaaa ccaagccgac cccaacagca ggcgatccag acacagatct 1620

cgttcgagaa gccccgatat ccaagcaaaa gaagagttat ggaagcacat tcaaaaagaa 1680

cttgtggatc catccggatt gtccgaagaa caattaaaag agattccata cactaaaata 1740

gagtgagtgc ctttcagaat cttctcacca aagctttatt agtgcttgac acaaggtgac 1800

ccaatccgca tcaggcattc tcattcgcca cgaagttacc gccagtatcg caggtcccag 1860

tgttcagatg gggagcgatc agttctctcg gaagtgaatt caaaaacaga tcttgtacca 1920

ccacttccgg tgacccattc ttcggatgct cagggttctg gggatgctac agttcatcag 1980

agaagaaatg ggtctaaaga tagcctgatg gaagaaaaac ctcagacatc tacaaacaac 2040

ctggctggaa aacacacagc aaaaacaata aaaactatac aagcttcccg cctcaagaca 2100

gagacttgat cctgatgaag ggtcaagggt aggggtggga aggttgtgtg cgccactggt 2160

acttttgaaa ctgtgaaata ggtatcttaa ttcaaatctc agacctgcaa gtatttcttc 2220

agcatgagaa aatacattat cttttgcttc tttttttttt ttttttgaga tgttatcact 2280

ctgtcgccca ggctggagtg cagcggcacc gtgtcagctc accgcagcct ccacttactg 2340

ggttaagcga ttctcctgtc tcaggctacc gagcagctgg gattacaggc gtgcaccaca 2400

acacccggct aattcttttt gtatttttag tagagacagg gctttgccat gttggaggct 2460

ggtctcgaac tcctgacctc aagtgatccg cctgcctcag 2500



<210> 9 

<211> 947 

<212> DNA 

<213> Homo sapiens 



<400> 9 

gaaagatgat actaggtcag gaaatagcat ttgaaagtca ttctcatctg gagggatgaa 60

gccaagataa ggcggaacca gggaaaagct ttaagaaagc aaagaatgaa aatagccctg 120

atacccaaag aagcaaatct catgcaccgt gggaagaaaa tggcccccag agtggactct 180

acaattctcc cagtgatcgc actaagtcgc caaagttccc ttacacgcgt cgccgaaacc 240

cctcctgtgg aagtgacaat gattctgtac agcctgtgag gaggaggaaa gcccattaac 300

agtggtgaag tttcagatct taaggcaaag ggagggaggt cacgttcacg ctgtaacacc 360

agcagtggta gtgaatcaga aaattctaat agagaacacc ggaaaaagag aaacagaata 420

cggcaggaga atgatatggt tgattcagcg cctcagtggg aagctgtatt aaggagacaa 480

aaggaaaaaa accaagccga ccccaacagc aggcgatcca gacacagatc tcgttcgaga 540

agccccgata tccaagcaaa agaagagtta tggaagcaca ttcaaaaaga acttgtggat 600

ccatccggat tgtccgaaga acaattaaaa gagattccat acactaaaat agagtgagtg 660

cctttcagaa tcttctcacc aaagctttat tagtgcttgt gagtaatcca ttctaattct 720

tcaattgtgt tccagacagt gctttaattt gtctttacat tttaaccaaa actaggtgac 780

agtagcgaaa gaggaagaaa agtgtgcatt aaagctactt attctacact ataatcacta 840

tcatctctta ttagccacct ctttgtactt ggtaggtaca agggggcttt tcctgattaa 900

tgtcagtttt aaaataaatt cttttctgag attctcactg aaaaaat 947



<210> 10 

<211> 1366 

<212> DNA 

<213> Homo sapiens 



<400> 10

gaaagatgat actaggtcag gaaatagcat ttgaaagtca ttctcatctg gagggatgaa 60

gccaagataa ggcggaacca gggaaaagct ttaagaaagc aaagaatgaa aatagccctg 120

atacccaaag aagcaaatct catgcaccgt gggaagaaaa tggcccccag agtggactct 180

acaattctcc cagtgatcgc actaagtcgc caaagttccc ttacacgcgt cgccgaaacc 240

cctcctgtgg aagtgacaat gattctgtac agcctgtgag gaggaggaaa gcccattaac 300

agtggtgaag tttcagatct taaggcaaag ggagggaggt cacgttcacg ctgtaacacc 360

agcagtggta gtgaatcaga aaattctaat agagaacacc ggaaaaagag aaacagaata 420 

cggcaggaga atgatatggt tgattcagcg cctcagtggg aagctgtatt aaggagacaa 480

aaggaaaaaa accaagccga ccccaacagc aggcgatcca gacacagatc tcgttcgaga 540

agccccgata tccaagcaaa agaagagtta tggaagcaca ttcaaaaaga acttgtggat 600 

ccatccggat tgtccgaaga acaattaaaa gagattccat acactaaaat agagacacaa 660 

ggtgacccaa tccgcatcag gcattctcat tcgccacgaa gttaccgcca gtatcgcagg 720 

tcccagtgtt cagatgggga gcgatcagtt ctctcggaag tgaattcaaa aacagatctt 780 

gtaccaccac ttccggtgac ccattcttcg gatgctcagg gttctgggga tgctacagtt 840 

catcagagaa gaaatgggtc taaagatagc ctgatggaag aaaaacctca gacatctaca 900 

aacaacctgg ctggaaaaca cacagcaaaa acaataaaaa ctatacaagc ttcccgcctc 960 

aagacagaga cttgatcctg atgaagggtc aagggtaggg gtgggaaggt tgtgtgcgcc 1020 

actggtactt ttgaaactgt gaaataggta tcttaattca aatctcagac ctgcaagtat 1080 

ttcttcagca tgagaaaata cattatcttt tgcttctttt tttttttttt ttgagatgtt 1140

atcactctgt cgcccaggct ggagtgcagc ggcaccgtgt cagctcaccg cagcctccac 1200

ttactgggtt aagcgattct cctgtctcag gctaccgagc agctgggatt acaggcgtgc 1260

accacaacac ccggctaatt ctttttgtat ttttagtaga gacagggctt tgccatgttg 1320

gaggctggtc tcgaactcct gacctcaagt gatccgcctg cctcag 1366



<210> 11 

<211> 422 

<212> DNA 

<213> Homo sapiens 



<400> 11

aatcttcata atccccatgt gtcaaaggag agaccaggtg gaggtaactg aatcatgggg 60

gtggtttccc caggctgttt ttgtgatagt gagtgagttc tcatgagatc tgatggtttt 120

ataaggggct cttccctcct ttgcttgtga agaaggtgcc ttttttcccc tttgccttct 180

gccatgattg taagtttcct gaggcctccc cagccaagct gaactgtgag tcaattaaac 240

ctcttttctt cgtaaattac ccagtcttga gcagttcttt acagcagtgt gaaaacagag 300

gaatacaccc atacatgcta ttctctgccc agaagccagg gggagcctgc cattaaaatg 360

aaagtcactc cttgactcag aaccctcaaa tagctttcat ctcacccaga aaaaaaagaa 420 

<210> 12 

<211> 1532 

<212> DNA 

<213> Homo sapiens 



<400> 12 
aggtttctgc 


acaggaatat 


cgagagcgtc atgaacccga gctatagaga aaggagatga 


60 


ggcgtgagcc 


accgcacccg 


gctgacaagt gtccttctaa gaaacacaca gaggagaaga 


120 


cacagaagag 


gagagcacca 


tgtgatggta gacacagaaa ttggagttct acagccacaa 


180 


gccaaggaac 


tcctggagcc 


accaggagat ggaagatgca aagaactgat tttctctcag 


240 


agcctctgga 


gggagtgtgg ccctggtgac accttgattt tggacttctg gcctacagaa 


300 


ccatgcacac 


aggaggactt 


catttcccag gtctccttgc agtgaagttg aggccatgtg 


360 


actggtcttg 


ggccaatgga 


atgggtgcag aagggacaca gcccatttct agactcagcc 


420 


tgaaatgtcc 


tccataatcc 


ttactctttc tcccttcact cactggctgc aggaagctga 


480 


gaattatcct 


tggacttaca 


taaagcattt tggactttat gtaagtaaca acctgttgta 


540 


ttaagctact 


aagattttac ggttgtttgt taaatcagct aaccttaaac atcctaacaa 


600 


ctacaaatag 


aatacctgtt 


actgcataca taaaaataca aaaattagct ggatgtggtc 


660 
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ccacctgtag 


tcccagctac 


tcgggaggct 



gaggcaggag 


aattgcttga 


acctgggagg 


720 


cggaggttgt 


ggtgagctga 


gatcgcacca 


ctgcactgca 


gcctgggcaa 


cagagcagga 


780 


ctctatctca 


aaaaaaaaac 


acaataaaca 


tttcttacct 


actgtagttt 


ttgtgggtca 


840


ggaatctggg 


agcagcttag 


ttggatgatt 


tctgctcaca 


gtgttttatg 


aggttgcagt 


900 


caagatgttg 


gctggggctg 


tagtcatctg 


gagatttaac 


tacggctgga 


ggatccactt 


960 


caccatggtt 


cactcacctg 


gtgctggttg 


ctggcaggaa 


atttcagctc 


ttctcttata 


1020 


tggatctctt 


cacagattgc 


ttgagtgtcc 


tcaccgtatg 


gtgactggct 


tcctttacag 


1080 


aaatcagttg 


aagggaatgg 


gcaagtaaga 


aacagcaatg 


ctttttatga 


cctagtcctg 


1140 


aagttcccca 


ccattactta 


tgttcattgg 


aagccagttg 


ctaaggagag 


cctgcactca 


1200 


aagattgggg 


aaatagactt 


tatctttcaa 


agtgttgaag 


aatttgcaga 


cgtattttaa 


1260 


aaccaccaca 


caatccatca 


acacatcatg 


tcggctctat 


tcttgaaata 


gatccagaat 


1320 


ttgaccactt 


ttcaccatct 


ccattgctat 


tacccagatc 


taatcaacac 


catcacttgc 


1380 


rtanart" ana 




1" f a r* t" rrn <rr» t* 

l. iw c* \_ y y v_ l. 




d L_ O L L lay LL. 


i^a L LyOLaLg 


1440


atttggctgt 


gtccccaccc 


aaaatrfraf 

QdClGL L l_ d l_ 




A A +* **- +- 
aatuULuaLa 




1500 


gtcaaaggag 


agaccaggtg 


y uyy tau^uy 


aa 






1532 


<210> 13 














<211> 1753 












<212> DNA 














<213> Homo sapiens 












<400> 13 
tttcttaggg 


tttttttttg 


agttggagcc 


tcgctctgtc 


ccccaggctg 


gagtgcagtg 


60 


atgtgatctc 


ggctcactgc 


aacctctgcc 


tcccaggttc 


aagtgattct 


cctgcctcag 


120 


cctccctagt 


agctgcgact 


acaggcatgt 


gccaccatgc 


ctggctaacg 


ttttgtattt 


180 


ttgagtagag 


acagggtttc 


accatgttgg 


ccaggctatt 


ctcgaactcc 


tgacctcaag 


240 


tgatccacct 


gcctcggctt 


cccaaagttt 


ctgggattac 


aggcgtgagc 


caccgcaccc 


300 


ggctgacaag 


tgtccttcta 


agaaacacac 


agaggagaag 


acacagaaga 


ggagagcacc 


360 


atgtgatggt 


agacacagaa 


attggagttc 


tacagccaca 


agccaaggaa 


ctcctggagc 


420 


caccaggaga 


tggaagatgc 


aaagaactga 


ttttctctca 


gagcctctgg 


agggagtgtg 


480 


gccctggtga 


caccttgatt 


ttggacttct 


ggcctacaga 


accatgcaca 


caggaggact 


540 


tcatttccca 


ggtctccttg 


cagtgaagtt 


gaggccatgt 


gactggtctt 


gggccaatgg 


600 


aatgggtgca 


gaagggacac 


agcccatttc 


tagactcagc 


ctgaaatgtc 


ctccataatc 


660 


cttactcttt 


ctcccttcac 


tcactggctg 


caggaagctg 


agaattatcc 


ttggacttac 


720 


ataaagcatt 


ttggacttta 


tgtaagtaac 


aacctgttgt 


attaagctac 


taagatttta 


780 


cggttgtttg 


ttaaatcagc 


taaccttaaa 


catcctaaca 


actacaaata 


gaatacctgt 


840 


tactgcatac 


ataaaaatac 


aaaaattagc 


tggatgtggt 


cccacctgta 


gtcccagcta 


900 
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ctcgggaggc 


tgaggcagga gaattgcttg aacctgggag gcggaggttg tggtgagctg 


960 


agatcgcacc 


actgcactgc 


agcctgggca acagagcagg 


actctatctc aaaaaaaaaa 


1020 


cacaataaac 


atttcttacc tactgtagtt tttgtgggtc aggaatctgg gagcagctta 


1080 


gttggatgat 


ttctgctcac agtgttttat gaggttgcag tcaagatgtt ggctggggct 


1140 


gtagtcatct 


ggagatttaa 


ctacggctgg aggatccact 


tcaccatggt tcactcacct 


1200 


ggtgctggtt 


gctggcagga 


aatttcagct cttctcttat 


atggatctct tcacagattg 


1260 


cttgagtgtc 


ctcaccgtat 


ggtgactggc ttcctttaca 


gaaatcagtt gaagggaatg 


1320 


ggcaagtaag 


aaacagcaat 


gctttttatg acctagtcct 


gaagttcccc accattactt 


1380 


atgttcattg 


gaagccagtt 


gctaaggaga gcctgcactc 


aaagattggg gaaatagact 


1440 


ttatctttca 


aagtgttgaa 


gaatttgcag acgtatttta 


aaaccaccac acaatccatc 


1500 


aacacatcat 


gtcggctcta 


ttcttgaaat agatccagaa 


tttgaccact tttcaccatc 


1560 


tccattgcta 


ttacccagat 


ctaatcaaca ccatcacttg 


cctggactag agatttcctc 


1620 


ctcactgggc 


tctctgcttc 


tatctttagc ccattgctat 


gatttggctg tgtccccacc 


1680 


caaaatctca 


tcttgaatta 


taatcttcat aatccccatg 


tgtcaaagga gagaccaggt 


1740 


ggaggtaact 


gaa 






1753 



<210> 14 

<211> 1832 

<212> DNA 

<213> Homo sapiens 



<400> 14 

gggttttgcg ggtataatta cattcaggat ctcaggatac tgcattatct gtgtgacccc 60 

taaatctgat gacaagtgtc tgttttttgt ttttgttttt gagacagagc ctcgctctgt 120 

cacccaggct ggagtgctgt ggtgtgatct cggctcactg caacctccgc ctcccaggtt 180 

caagcaattc tctgcctcag cctcccgagt aaatgtgatt acaggcaggc gcctgccagc 240 

acacccagct gattttagta tttttagtag agatggggtt tcaccatctt ggccaggctg 300 

gtcttgaatt cctgacctcg tgatccaccc acttcagctt cccaaagttc tgggattaca 360 

ggcgtgagcc accgcacccg gctgacaagt gtccttctaa gaaacacaca gaggagaaga 420 

cacagaagag gagagcacca tgtgatggta gacacagaaa ttggagttct acagccacaa 480

gccaaggaac tcctggagcc accaggagat ggaagatgca aagaactgat tttctctcag 540 

agcctctgga gggagtgtgg ccctggtgac accttgattt tggacttctg gcctacagaa 600 

ccatgcacac aggaggactt catttcccag gtctccttgc agtgaagttg aggccatgtg 660 

actggtcttg ggccaatgga atgggtgcag aagggacaca gcccatttct agactcagcc 720 

tgaaatgtcc tccataatcc ttactctttc tcccttcact cactggctgc aggaagctga 780 

gaattatcct tggacttaca taaagcattt tggactttat gtaagtaaca acctgttgta 840
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ttaagctact 


aagattttac 


ggttgtttgt 


taaatcagct 


aaccttaaac 


atcctaacaa 


900 


ctacaaatag 


aatacctgtt 


actgcataca 


taaaaataca 


aaaattagct 


ggatgtggtc 


960 


ccacctgtag 


tcccagctac 


tcgggaggct 


gaggcaggag aattgcttga 


acctgggagg 


1020 


cggaggttgt 


ggtgagctga 


gatcgcacca 


cfc are a 1* af* a 




cagagcagga 


1080 


ctctatctca 


aaaaaaaaac 


acaataaaca 


tttcttacct


a. L- y Lay u. i_ u 


ttgtgggtca 


1140 


ggaatctggg 


agcagcttag 


ttggatgatt 


tctcfctcaca 


y ty ill. i_ a. \_ y 


aggttgcagt 


1200 


caagatgttg 


gctggggctg 


tagtcatctg 


aaaatttaar 


LaLy y l Lyy a 


ggatccactt 


1260 


caccatggtt 


cactcacctg 


gtgctggttg 


ctaacaaaaa 


atttcaactc 


ttctcttata 


1320 


tggatctctt 


cacagattgc 


ttgagtgtcc 


tcaccgtatg 


y LyauLyy v,l 


tcctttacag 


1380 


aaatcagttg 


aagggaatgg 


gcaagtaaga 


aacagcaatg 


ctttttatga 


cctagtcctg 


1440 


aagttcccca 


ccattactta 


tgttcattgg 


aagccagttg 


ctaaggagag 


cctgcactca 


1500 


aagattgggg 


aaatagactt 


tatctttcaa 


agtgttgaag 


aatttgcaga 


cgtattttaa 


1560 


aaccaccaca 


caatccatca 


acacatcatg 


tcggctctat 


tcttgaaata 


gatccagaat 


1620 


ttgaccactt 


ttcaccatct 


ccattgctat 


tacccagatc 


taatcaacac 


catcacttgc 


1680 


ctggactaga 


gatttcctcc 


tcactgggct 


ctctgcttct 


atctttagcc 


cattgctatg 


1740 


atttggctgt 


gtccccaccc 


aaaatctcat 


cttgaattat 


aatcttcata 


atccccatgt 


1800 


gtcaaaggag 


agaccaggtg 


gaggtaactg 


aa 






1832 



<210> 15 

<211> 10394 

<212> DNA 

<213> Homo sapiens 



<400> 15 
cgttgtttgg 


cgtgtttttt 


tttttgtttt 


ttgtcactgc 


ctgcctgggt 


cctgcccgag 


60 


gtctccatcc 


tcggtttccc 


tgtccttgcc 


ccgggccctg 


ggagtgctct 


ggaaggctgc 


120 


gcagtattgg 


aggggacaga 


atgaccttcc 


ggccttgagt 


ccctggggag 


cagatggacc 


180 


ctactggaag 


tcagttggat 


tcagatttct 


ctcagcaaga 


tactccttgc 


ctgataattg 


240 


aagattctca 


gcctgaaagc 


caggttctag 


aggatgattc 


tggttctcac 


ttcagtatgc 


300 


tatctcgaca 


ccttcctaat 


ctccagacgc 


acaaagaaaa 


tcctgtgttg 


gatgttgtgt 


360 


ccaatcctga 


acaaacagct 


ggagaagaac 


gaggagacgg 


taatagtggg 


ttcaatgaac 


420 


atttgaaaga 


aaacaaggtt 


gcagaccctg 


tggattcttc 


taacttggac 


acatgtggtt 


480 


ccatcagtca 


ggtcattgag 


cagttacctc 


agccaaacag 


gacaagcagt 


gttctgggaa 


540 


tgtcagtgga 


atctgctcct 


gctgtggagg 


aagagaaggg 


agaagagttg 


gaacagaagg 


600 


agaaagagaa 


ggaagaagat 


acttcaggca 


atactacaca 


ttcccttggt 


gctgaagata 


660 


ctgcctcatc 


acagttgggt 


tttggggttc 


tggaactctc ccagagccag gatgttgagg 


720 


aaaatactgt 


gccatatgaa 


gtggacaaag 


agcagctaca 


atcagtaacc 


accaactctg 


780 
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gttataccag gctgtctgat gtggatgcta atactgcaat taagcatgaa gaacagtcca 840 

acgaagatat ccccatagca gaacagtcca gcaaggacat ccctgtgaca gcacagccca 900 

gtaaggatgt acatgttgta aaagagcaaa atccaccacc tgcaaggtca gaggacatgc 960 

cttttagccc caaagcatct gttgctgcta tggaagcaaa agaacagttg tctgcacaag 1020 

aacttatgga aagtggactg cagattcaga agtcaccaga gcctgaggtt ttgtcaactc 1080 

aggaagactt gtttgaccag agcaataaaa cagtatcttc tgatggttgc tctactcctt 1140

caagggagga aggtgggtgt tctttggctt ccactcctgc caccactctg catctcctgc 1200 

agctctctgg tcagaggtcc cttgttcagg acagtctttc cacgaattct tcagatcttg 1260 

ttgctccttc tcctgatgct ttccgatcta ctccttttat cgttcctagc agtcccacag 1320 

agcaagaagg gagacaagat aagccaatgg acacgtcagt gttatctgaa gaaggaggag 1380 

agccttttca gaagaaactt caaagtggtg aaccagtgga gttagaaaac ccccctctcc 1440 

tgcctgagtc cactgtatca ccacaagcct caacaccaat atctcagagc acaccagtct 1500 

tccctcctgg gtcacttcct atcccatccc agcctcagtt ttctcatgac atttttattc 1560 

cttccccaag tctggaagaa caatcaaatg atgggaagaa agatggagat atgcatagtt 1620 

catctttgac agttgagtgt tctaaaactt cagagattga accaaagaat tcccctgagg 1680 

atcttgggct atctttgaca ggggattctt gcaagttgat gctttctaca agtgaatata 1740 

gtcagtcccc aaagatggag agcttgagtt ctcacagaat tgatgaagat ggagaaaaca 1800 

cacagattga ggatacggaa cccatgtctc cagttctcaa ttctaaattt gttcctgctg 1860

aaaatgatag tatcctgatg aatccagcac aggatggtga agtacaactg agtcagaatg 1920 

atgacaaaac aaagggagat gatacagaca ccagggatga cattagtatt ttagccactg 1980 

gttgcaaggg cagagaagaa acggtagcag aagatgtttg tattgatctc acttgtgatt 2040 

cggggagtca ggcagttccg tcaccagcta ctcgatctga ggcactttct agtgtgttag 2100 

atcaggagga agctatggaa attaaagaac accatccaga ggaggggtct tcagggtctg 2160 

aggtggaaga aatccctgag acaccttgtg aaagtcaagg agaggaactc aaagaagaaa 2220 

atatggagag tgttccgttg cacctttctc tgactgaaac tcagtcccaa gggttgtgtc 2280 

ttcaaaagga aatgccaaaa aaagaatgct cagaagctat ggaagttgaa accagtgtga 2340 

ttagtattga ttcccctcaa aagttggcaa tacttgacca agaattggaa cataaggaac 2400 

aggaagcttg ggaagaagct acttcagagg actccagtgt tgtcattgta gatgtgaaag 2460 

agccatctcc cagagttgat gtttcttgtg aacctttgga gggagtggag aagtgctcag 2520 

attcccagtc atgggaggat attgctccag aaatagaacc atgtgctgag aatagattag 2580 

acaccaagga agaaaagagt gtagaatatg aaggagatct gaaatcaggg actgcagaaa 2640 

cagaacctgt agagcaagat tcttcacagc cttccttacc tttagtgaga gcagatgatc 2700 

ctttaagact tgaccaggag ttgcagcagc cccaaactca ggagaaaaca agtaattcat 2760 

taacagaaga ctcaaaaatg gctaatgcaa agcagctaag ctcagatgca gaggcccaga 2820 

agctggggaa gccctctgcc catgcctcac aaagcttctg tgaaagttct agtgaaaccc 2880 
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catttcattt 


cactttgcct 


aaagaaggtg 


atatcatccc 


accattgact


ggtgcaaccc 




cacctcttat 


tgggcaccta 


aaattggagc 


ccaagagaca 


cagtactcct 


attggtatta 




gcaactatcc 


agaaagcacc 


atagcaacca 


gtgatgtcat 


gtctgaaagc 


atggtggaga 




cccatgatcc 


catacttggg 


agtggaaaag 


gggattctgg 


ggctgcccca gacgtggatg 




ataaattatg 


tctaagaatg 


aaactggtta 


gtcctgagac 


toaaacaaat 


gaagagtctt 


3180


tgcagttcaa 


cctggaaaag 


cctgcaactg 


gtgaaagaaa 


a a a ^ 3 3 w w 


actgctgttg 


3240


ctgagtctgt 


tgccagtccc 


cagaagacca 


tgtctgtgtt 


naactahstc 


tataaacrcca 




ggcaagagaa 


tgaggctcga 


agtgaggatc 


cccccaccac 




gggaacttgc 




tccactttcc 


aagttctcaa 


ggagaagagg 


agaaagaaaa 




gaccatacaa 


O ^ £- 


tcaggcagag 


tcaacagcct 


atgaagccca 


ttagtcctgt 




gtttctcctg 




cttcccagaa 


gatggtcata 


caagggccat 


ccagtcctca 


0.uuauayyua 


ataataacaa 

a \jj y *^ ^ v.- 




atgtgctaga 


agaccagaaa 


gaaggacgga 


gtactaataa 


y^aaaci l-^^l 


agtaaggcct 




tgattgaaag gcccagccaa 


aataacatag 


gaatccaaac 


i_y y ay uy i_ 


uuu l. ^3«*y yy 


3660


tcccagaaac 


tgtttcagca 


gcaacccaga 


ctataaagaa 


L.y uy uy uyoy 


^ — • d \^ y y y a www 


3720


gtacagtgga 


ccagaacttt 


ggaaagcaag 


atgccacagt 


LUayau uy ay 




3780


gtgagaaacc 


agtcagtgct 


cctggggatg 


atacagagtc 




caaaaaaaao 


3840


aagagtttga 


tatgcctcag 


cctccacatg 


gccatgtctt 


a r"*A t* fcrt" car* 


atgagaacaa 




tccgggaagt 


acgcacactt 


gtcactcgtg 


tcattacaga 


tatatattat 

u.y Ly u a i. i*a 




3960


cagaagtaga 


aagaaaagta 


actgaggaga 


ctgaagagcc


aoLty uay ay 


tatcaaaaat 

\m y w y y tj 




gtgaaactga


agtttcccct 


tcacagactg 


ggggctcctc 


=» dot" oaccta 

uy y ^y avvwy 


gggga ta tea 




gctccttctc 


ctccaaggca 


tccagcttac 


accgcacatc 


day uyyy u^u 


rrf" ntctcan 

ay i>w wv ^ v».cavj 


4140


ctatgcacag 


cagtggaagc 


tcagggaaag 


gagccggacc 


a v t^Qyayyy 


aaaaccagcg 




ggacagaacc 


cgcagatttt 


gccttaccca 


gctcccgagg 


aaocccaaaa 

CA y u w V* >»*■ £i y ^ 


aaactgagtc 




ctagaaaagg 


ggtcagtcag 


acagggacgc 


cagtgtgtga 


aaaoaataat 


aatacaqqcc 


4320


ttggcatcag 


acagggaggg 


aaggctccag 


tcacgcctcg 


tqqqcqtqqq 


cqaaqggqcc 

w 3 3 3 3 3 




gcccaccttc 


tcggaccact 


ggaaccagag 


aaacagctgt 


gcctggcccc 


ttqqqcatag 

v. 3 3 3 


4440


aggacatttc acctaacttg 


tcaccagatg 


ataaatcctt 


cagccgtgtc 


ataccccaaq 




tgccagactc 


caccagacga 


acagatgtgg 


gtgctggtgc 


tttgcgtcgt


agtgactctc 




cagaaattcc 


tttccaggct 


gctgctggcc 


cttctgatgg 


cttagatgcc 


tcctctccag 


4620 


gaaatagctt 


tgtagggctc 


cgtgttgtag 


ccaagtggtc 


atccaatggc 


tacttttact 




ctgggaaaat 


cacacgagat 


gtcggagctg 


ggaagtataa 


attgctcttt 


gatgatgggt 




acgaatgtga 


tgtgttgggc 


aaagacattc 


tgttatgtga 


ccccatcccg 


ctggacactg 


4800 


aagtgacggc 


cctctcggag 


gatgagtatt 


tcagtgcagg 


agtggtgaaa 


ggacatagga 


4860 


aggagtctgg 


ggaactgtac 


tacagcattg 


aaaaagaagg 


ccaaagaaag 


tggtataagc 


4920 


gaatggctgt 


catcctgtcc 


ttggagcaag 


gaaacagact 


gagagagcag 


tatgggcttg 


4980 


gcccctatga 


agcagtaaca 


cctcttacaa 


aggcagcaga 


tatcagctta 


gacaatttgg 


5040 
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tggaagggaa gcggaaacgg 
gtagcagcag cacaacccct 
ttctctcagg caaaagaaaa 
gtcgcaagtc tgccacagta 
gtgagagtgg agacaacacc 
ctctcaacaa gaccttgttt 
acaagttggc cagccgctcc 
aggaattttt ggaaattcct 
gagctggcta tatccttgaa 
ttctaattgc ggatcagcat 
ttccttgtgt gtctcatgtc 
accgtaatta tctgttgcca 
aaccccgtga aaatcctttc 
acttcctgga gctctggtct 
accattcaag tgcccataac 
acccctcatg cccagcctcg 
cacaagagtg ggtgatccag 
caaaatataa acacgattat 
ctgctatcgt ggagattgtg 
tccttgcatg gatcttgtat 
gaatctcttt ggttgtagta 
aacaaatgac aagacatagt 
ggacagctca ccaaggaaat 
caaaaggtta ttccagggtg
tgaaggccca gattaacagt 
ctgacagtta aaaaagacct 
gtactcccca actcttagag 
gtgaccataa gcttgatgga 
acacctggaa atgttacacg 
acaatagctg gaagcagttc 
cattctctaa agcaggggtc 
tcaggctttg tgggccattt 
agcaaccata gacaatgagt 
aaacagacag caggccatag 
atctttagtt gataatagca 




cgcagtaacg tcagctcccc 
acccgaaaga tcacagaaag 
cttatcactt ctgaagagga 
aaacctggtg cagtaggggc 
ggtgaaccct ctgccctgga 
ctgggctacg catttctcct 
aaactgccag atggtcctac 
cctttcaaca agcagtatac 
gatttcaatg aagcccagtg 
tgtcgaaccc ggaagtactt 
tgggtccatg atagttgcca 
gctgggtaca gccttgagga 
cagaatctga aggtactctt 
gagatcctca tgactggtgg 
aaagatattg ctttaggggt 
gtgctgaagt gtgctgaagc 
tgcctcattg ttggggagag 
gtttctcact aaagatactt 
ttttaaccag gttttaaatg 
atagttttat ttgctgaact 
actgggattt cttcatctgt 
actttctctg agtctttcaa 
tgaaaagtta agagtgaact 
tctaaaatgc tatgcttgca 
tgtgccaaaa gttgagtgga 
catgctctct ctctgagctg 
ctaaagggag aacgaaagga 
atgaccttcc gtaagataaa 
ttctagtcaa agacccaata 
cttcccttcc tctggcatca 
aacaaggttt ttttctgtaa 
gatccatcac aactactcgc 
aaacaaatgg gcacggctgt 
tttgccagct cctgctccag 
gggaataagt tgtcagagct 



agccacccct actgcctcca 
tcctcgtgcc tccatgggag 
acggtcccct gccaagcgag 
aggagagttt gtgagcccct 
agagcagaga gggcctttgc 
taccatggcc acaaccagtg 
aggaagcagt gaagaagagg 
agaatcccag cttcgagcag 
taacacagct taccagtgtc 
cctgtgcctt gccagtggga 
tgccaaccag ctccagaact 
gcaaagaatt ctggactggc 
ggtatcagac caacagcaga 
tgcagcctct gtgaagcagc 
atttgatgtg gtggtgacgg 
attgcagctg cctgtggtgt 
aattggattc aagcagcatc 
ggtcttactg gttttattcc 
tgtcttgtgt gtaactggat 
tttatgataa aataaatgtt 
ttttttgagc ttaatctcag 
caggcttatt cacttacgga 
ttattctgtg gcatcattcc 
gaaactcagt ttaaggtagg 
attgggcaca gctctgtttc 
agatcacagc tcacctgtgg 
ccaactgcca tgaagggaca 
catgggaagc acaagtgaga 
ttattattat tattattgtc 
ctgatccctg catggcttct 
agggtcaaag agtaaatatt 
ctttgctgtg agggcatgaa 
gtttcagtaa aactgtacaa 
agacagcagt ggaaagggtg 
tcccagtgtg tgtagaatat 
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5340 
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6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 
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gtagtgatga aaaccagatg 
cagtttggag ggcgttgttc 
ttaccagtga tggctgggcc 
ggtaattggg cctgcaatct 
ttaaggacct ttttcttgga 
tttcgctctt gttgcccagg 
gcttcccagg ttcaagcaat 
catgccacca cacctggcta 
gtcaggctgg tctcaaactc 
ctgggattac aggtgtgagc 
attagctaga attgcccaat 
cctatcttac ctgttgcttt 
acataattac tttttcacat 
gtttaccgct gtatctccca 
aatgactatt gaataaatga 
gccacatgtt ctccttgatg 
actcctggat gttcagtaac 
agatatttat ttgtgtgtgg 
aacgtggctg ttatttcaat 
gaaaaagagc cttattaagt 
aggataacct gtgatctaaa 
tgctacaata gcttagaaaa 
gtgattttct gaatgggaat 
tcagtttttt ttaaagttta 
gctagggacc agcaaggcgg 
agtggaaggc ccactgcctc 
tatctcactg tcaggttttt 
ctccattaaa attattaaag 
gggtatttgg actaaagtct 
cttaaaaagt ctactgagtt 
tgggtatagt atttgttata 
tgatgattcc gaacactgga 
tgcagagaga aggactcagc 
tgggggcgga gggaggagtt 
gccttcccac attcctattt 
catgacttcc tatgaaagta 




cagtgactat aacctgatgc 
agtgaatatt tctttttact 
atattaagat aacttcaacc 
tcagtattta aaaatctaac 
gaataatact tttttttttt 
ctggaatgca atggcacaat 
tctcctgtct cagcctcctg 
atttttgtat ttttagtaga 
ctgacttcag gtgatccgcc 
caccatgccc ggcctaagaa 
ctgtgtaggt ataaattact 
cttacttggt ggtaacatcc 
atgaaccata aaatatttaa 
cagcttgaac agtaccaagg 
acatatccaa caaatgttct 
ggagagaccc ttccacatgg 
tgcttctagg agaaaaggta 
ctagaatggg atgttttgaa 
ttatgagcca gaaattttca 
gtcatgcttt cccaagacta 
taatgtcatc ttaaaactga 
aaatctgctt gcagacattt 
gacagacctc tgggaagcca 
gagttagaag gggtggtcgc 
ggtgcccacg gctgcacagc 
ccagcatagc aatacataac 
tagtatttta tgatgatgat 
atggtcacac ctctatctct 
tcttccagtt ctagaattct 
acccaagggt tgctcctacc 
atctagtcgt aacagtagtt 
gagaatcttg aacaggagtg 
tgtcattcca cttcagctca 
tcttggaaaa gccttgttca 
ctattagttc ctaaaatgac 
ctctttcatc agtaggaatt 



cagaacactg cattcttttt 
tacactgata tgaatattga 
cctatggttt gtgtaagatg 
aacttgatct caattttttc 
tttttttttt tgagacggaa 
ctcagctcac tgcagcgtct 
agtagctggg attacaggca 
atcgaggttt catcatgttg 
cgcctcggcc tcccaaagtg 
atacttttaa gtatattttc 
tggtataggg agagagaaag 
agcagttagt ctatttataa 
ctttctgctc tatattgttt 
tacgtagtag gtgctcaata 
caatgtaaag gatcagagat 
gaatgatggg aaggagttgt 
gagtcctatc actaagccgc 
tcttctgtta caaccttggg 
catcccgaaa ctacaaaaga 
ccttcaaaga aatatgaatc 
agagtttctt ttgactcttc 
tagagagaaa ggacaatgaa 
gctaccactg aatctcggta 
ctcctttcac agatgcggaa 
tagttcatat cagaattggg 
ctagcaaagg acttaacacc 
gacttctact agaaaataac 
aagccttact tataaaatga 
acaactcatt aaaaagccac 
tgcccagagt tccaccagcc 
gagccaaatc tgagttgatc 
aagactggcg gctaaagccc 
ccaactctcc atatggagga 
aaattctaca gaaccacctg 
ttgtaccaaa tccatacatg 
tagtagctgg tttccagtta 
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7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
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atgtattttg 

tctcagtgag 

gaacaaccca 

gcactgtgag 

ctttgtatct 

agatcatgaa 

caggcagacg 

ctaggttctg 

agtgaaacac 

gaagttaaag 

ccaaatctga 

aatgcgtaaa 

ccaaacctcc 

tgcctctgag 

tcaagtccca 

gtttcttgct 

ttgttactga 

agctccttaa 

gtaagcactc 



tcaagtactg 

atgggggtta 

atgcttattt 

gtggcacagt 

tatctctgta 

ttctccattg 

caccttcacg 

actgaccagc 

ctacccggga 

tgttaggaga 

caggatagac 

acaaatgcat 

aatctagcca 

ctgtgagtgt 

ctctctcagt 

tctttattat 

tttttgtatg 

gggcagagcc 

aata 



gggttgggga 

gttcaaggaa 

gatgggctga 

aattacctgc 

cgtgtgtgta 

gcaaaaccac 

agaatgctca 

gaacaaaaac 

aacagagttg 

acagtctgat 

actgccacgt 

cctttcctgg 

catttaactc 

tgttcccttt 

gaagcactcc 

ctgtactgtt 

tatatatata 

atgaattata 




gaacccgttt 

gtaaggaggg 

ataaactatt 

ttcaaaatca 

ttgaggaaat 

ctctgtcctt 

gctgggcggc 

tgtgacagag 

gcattaggaa 

taatagctga 

gcaaggcctg 

ctaagcgagt 

ttcatttctt 

gcccgggatg 

cttccccact 

gtccacttgg 

tatatgtctt 

cctctttgta 



tgattacaag 

gggaggatgt 

caggactgaa 

actgatacca 

gctttactga 

tcggcaaggc 

tccacgctca 

atctaggatt 

aggaaggaag 

tctaattaat 

ccagcccctc 

attactctct 

agacccgcag 

ctcttgtttt 

atagccttta 

caattgttca 

gtttttccaa 

tccccagtgc 



cagataatta 

gaggaagtta 

ctatttttga 

acatttttat 

ctcagaggaa 

tgcatacttc 

tccagtgggc 

tcattcaggc 

gtacatccat 

agctgacctc 

agacgcacaa 

tagccctgca 

agtgtcttcc 

taataccagt 

gtgaaccctc 

ggcctctgtg 

ctagattgtg 

cttgcataca 



9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10394 



<210> 16 

<211> 6837 

<212> DNA 

<213> Homo sapiens 



<400> 16 

agcatcgagt cggccttgtt gcctactgga gtctccgcag agcccgggcg ggagtagctg 60 

gtggaccccg ttgagctgcc gaacttccgg gactcccccg cgaccccttc ccagcttccc 120 

gtccgctccg ccgcagcgat tgtctcggtg ggttgattcg gcacaaaccg cccgacccag 180 

gggccggtgc gcgtgtggaa ggggaagcac tcccctcgtg gtcgcctgga ggtgcgctgg 240 

aggagggggt gacataacca gggactcgag gtccgccgtg ggaatgatcc acgaactgct 300 

cttggctctg agcgggtacc ctgggtccat tttcacctgg aacaagcgga gtggcctgca 360 

ggtatcgcag gacttccctt tcctccaccc cagtgagacc agtgtcctga atcgactctg 420

ccggctcggc acagactata ttcgcttcac tgagttcatt gaacagtaca cgggccatgt 480

gcaacagcag gatcaccatc catctcaaca gggccaaggt gggttacatg gaatctacct 540

gcgggccttc tgcacagggc tggattctgt tttgcagcct tatcgccaag cactgcttga 600
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tttggaacaa 


gagttcctgg 


gtgatcccca 



tctctccata 


tcacatgtca 


actacttcct 


660 


agaccagttc 


cagcttcttt 


ttccctctgt 


gatggttgta 


gtagaacaaa 


ttaaaagtca 


720 


aaagattcat 


ggttgtcaaa 


tcctggaaac 


agtctacaaa 


cacagctgtg 


gggggttgcc 


780 


tcctgttcga 


agtgcactgg 


aaaaaatcct 


ggccgtttgt 


catggggtca 


tgtataaaca 


840 


gctctcagcc 


tggatgctcc 


atggactcct 


cttggaccag 


catgaagaat 


tctttatcaa 


900 


acaggggcca 


tcttctggta 


atgtcagtgc 


ccagccagaa 


gaggacgagg 


aggatctggg 


960 


cattggggga 


ctgacaggaa 


aacaactgag 


agaactgcag 


gacttgcgcc 


tgattgagga 


1020 


agagaacatg 


ctggcaccat 


ctctgaagca 


gttttcccta 


cgagtggaga 


ttttgccatc 


1080 


ctacattcca 


gtgagggttg 


ctgaaaaaat 


cctatttgtt 


ggagaatctg 


tccagatgtt 


1140 


tgagaatcaa 


aatgtgaacc 


tgactagaaa 


aggatccatt 


ttgaaaaacc 


aggaagacac 


1200 


ttttgctgca 


gagctgcacc 


gtctcaagca 


gcagccactc 


ttcagcttgg 


tggactttga 


1260 


acaggtggtg 


gatcgcattc 


gcagcactgt 


ggctgagcat 


ctctggaagt 


tgatggtaga 


1320 


agaatccgat 


ttactgggtc 


agctgaagat 


cattaaagac 


ttttaccttc 


tgggacgtgg 


1380 


agaactgttt 


caggccttca 


ttgacacagc 


tcaacacatg 


ttgaaaacac 


cacccactgc 


1440 


agtaactgag 


catgatgtga 


atgtggcctt 


tcaacagtca 


gcacacaagg 


tattgctaga 


1500 


tgatgacaac 


cttctccctc 


tgttgcactt 


gacaatcgag 


tatcacggaa 


aggagcacaa 


1560 


agcagatgct 


actcaggcaa 


gagaagggcc 


ttctcgggaa 


acttctcccc 


gggaagcccc 


1620 


tgcatctggc 


tgggcagccc 


taggtctttc 


ctacaaagta 


cagtggccac 


tacatattct 


1680 


cttcacccca 


gctgtcctgg 


aaaagtacaa 


tgttgttttt 


aagtacttac 


tgagtgtgcg 


1740 


ccgggtgcaa 


gctgagctgc 


agcactgctg 


ggccctacaa 


atgcagcgca 


agcacctcaa 


1800 


gtcgaaccag 


actgatgcaa 


tcaagtggcg 


cctaagaaat 


cacatggcat 


ttttggtgga 


1860 


taatcttcag 


tactatctcc 


aggtagatgt 


gttggagtct 


cagttctccc 


agctgcttca 


1920 


tcagatcaat 


tctacccgag 


actttgaaag 


catccgattg 


gctcatgacc 


acttcctgag 


1980 


caatttgctg 


gctcaatcct 


ttatcctatt 


gaaacctgtg 


tttcactgcc 


tgaatgaaat 


2040 


cctagatctc 


tgtcacagtt 


tttgtttgct 


ggtcagtcag 


aacctaggcc 


cactggatga 


2100 


gcgtggagcc 


gcccagctga 


gcattctcgt 


gaagggcttt 


agccgccagt 


cttcactcct 


2160 


gttcaagatt 


ctctccagtg 


ttcggaatca 


tcagatcaac 


tcagatttgg 


ctcaactact 


2220 


gttacgacta 


gattataaca 


aatactatac 


ccaggctggt 


ggaactctgg 


gcagtttcgg 


2280 


gatgtgaaaa 


tttctggctc 


ataaattgaa 


ataacagcca 


cgttcccaag 


gttgtaacag 


2340 


aagattcaaa 


acatcccatt 


ctagccacac 


acaaataaat 


atctgcggct 


tagtgatagg 


2400 


actctacctt 


ttctcctaga 


agcagttact 


gaacatccag 


gagtacaact 


ccttcccatc 


2460 


attcccatgt 


ggaagggtct 


ctcccatcaa 


ggagaacatg 


tggcatctct 


gatcctttac 


2520 


attgagaaca 


tttgttggat 


atgttcattt 


attcaatagt 


catttattga 


gcacctacta 


2580 


cgtaccttgg 


tactgttcaa 


gctgtgggag 


atacagcggt 


agacaaacaa 


tatagagcag 


2640 


aaagttaaat 


attttatggt 


tcatatgtga 


aaaagtaatt 


atgtttataa 


atagactaac 


2700 


tgctggatgt 


taccaccaag 


taagaaagca 


acaggtaaga 


taggctttct 


ctctccctat 


2760 
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accaagtaat 


ttatacctac 


acagattggg 


gtatttctta 


ggccgggcat 


ggtggctcac 


ggcgggcgga 


tcacctgaag 


tcaggagttt 


cgattctact 


aaaaatacaa 


aaattagcca 


tactcaggag 


gctgagacag 


gagaattgct 


tgagattgtg 


ccattgcatt 


ccagcctggg 


aaaaaaaaaa 


aaaaagtatt 


attctccaag 


agttgttaga 


tttttaaata 


ctgaagattg 


taggggttga 


agttatctta 


atatggccca 


tgtaagtaaa 


aagaaatatt 


cactgaacaa 


tctggcatca 


ggttatagtc 


actgcatctg 


gggaagctct 


gacaacttat 


tccctgctat 


gtctctggag 


caggagctgg 


caaactatgg 


aaacacagcc 


gtgcccattt 


gtttactcat 


aaaggcgagt 


agttgtgatg 


gatcaaatgg 


ccctttacag 


aaaaaaacct 


tgttgacccc 


tcagtgatgc 


cagaggaagg 


gaaggaactg 


ataatattgg 


gtctttgact 


agaacgtgta 


catgtttatc 


ttacggaagg 


tcattccatc 


ttggtccttt 


cgttctccct 


ttagctctaa 


atctcagctc 


agagagagag 


catgaggtct 


caattccact 


caacttttgg 


cacaactgtt 


tttctgcaag 


catagcattt 


tagacaccct 


ataaagttca 


ctcttaactt 


ttcaatttcc 


cctgttgaaa 


gactcagaga 


aagtactatg 


aaaaacagat 


gaagaaatcc 


cagttactac 


taaaagttca 


gcaaataaaa 


ctatatacaa 


gacacattta 


aaacctggtt 


aaaacacaat 


gaccaagtat 


ctttagtgag 


aaacataatc 


aattctctcc 


ccaacaatcra 


oacactaoat 


caatgcttca 


gcacacttca 


gcaccgaggc 


aaatacccct 


aaagcaatat 


ctgcaaggag 


aacttagccc 


tccattagaa 


agagagattt 


aaccaaagcc 


ctattatgtc 


aaacacactg 


atcactatgt 


gctgaccttg 


tagaaatatt 




caattctagc taatgaaaat atacttaaaa 
acctgtaatc ccagcacttt gggaggccga 
gagaccagcc tgaccaacat gatgaaacct 
ggtgtggtgg catgtgcctg taatcccagc 
tgaacctggg aagcagacgc tgcagtgagc 
caacaagagc gaaattccgt ctcaaaaaaa 
aaaaaggtcc ttaagaaaaa attgagatca 
caggcccaat tacccatctt acacaaacca 
gccatcactg gtaatcaata ttcatatcag 
cgccctccaa actgaaaaag aatgcagtgt 
gttttcatca ctacatattc tacacacact 
tatcaactaa agatcaccct ttccactgct 
cctgctgtct gtttttgtac agttttactg 
tgtctatggt tgctttcatg ccctcacagc 
cccacaaagc ctgaaatatt tactctttga 
tgctttagag aatgagaagc catgcaggga 
cttccagcta ttgtgacaat aataataata 
acatttccag gtgttctcac ttgtgcttcc 
aagcttatgg tcactgtccc ttcatggcag 
gagttgggga gtacccacag gtgagctgtg 
tttttaactg tcaggaaaca gagctgtgcc 
aatctgggcc ttcacctacc ttaaactgag 
ggaataacct tttgggaatg atgccacaga 
ttggtgagct gtcctccgta agtgaataag 
tcttgtcatt tgttctgaga ttaagctcaa 
aaccaaagag attcaacatt tattttatca 
gatccatgca aggaatccag ttacacacaa 
ctccacgata gcagggaata aaaccagtaa 
gtgtttatat tttggatgct gcttgaatcc 
cacccactct tgtgacacca caggcagctg 
tgggcatgag gggtccgtca ccaccacatc 
caagggaaag tgaagaagga aaggacactc 
gattctaacc aatacatccc actctgcaca 
ctactgatca tgaccaaagg cagagttata 
taacaaatat acgtccagtg cttcacttat
2820

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860
gttgactcac


ctcttgaagg
tggtactttt cttctctaag


aaacatggat 


acggtcaacc 


4920 


tattaggcct 


gagccttgga 


ccacaaggcc taacacctac 


aggtctaagg 


agatccctgg 


4980 


aacaaagaca 


ctacacacac 


tctttcaggt acctttgtta 


tgggcacttg 


aatggtgctg 


5040 


cttcacagag 


gctgcaccac 


cagtcatgag gatctcagac 


cagagctcca 


ggaagttctg 


5100 


ctgttggtct 


gataccaaga 


gtaccttcag attctggaaa 


ggattttcac 


ggggttgcct 


5160 


atgaaggaga 


caggaaagga 


ccttagcatg acaagtaata 


tccaacaaac 


tgcctttctg 


5220 


caaagggact 


catgtacatc 


tgaatgcttt caaaaataaa 


tgccccatca 


gacatagtgt 


5280 


ctcaagcctg 


taatcccagc 


actttgggag gctgtcgtgg 


ttggatctct 


tgggcctggg 


5340 


agttcgagac 


cagcctgggc aatgtggtga gaccccatct 


ctacaaaaga 


caacaaaaaa 


5400 


attagctggg 


u y ^yy **yy^y 


aotocctata atcccancaa 


cttgggaggc 


tgaggtaggg 


5460 


ggatcacttc 


V* y w w w U ^ u \J 


y Lu^ayijiwCVj Lay uaay toy 


tcactgcgcc 


actgtactcc 


5520 


agcctaggtg 


acaaaacaaa 




gccctatatt 


agggtccccc 


5580 


ttctcttcct 


tctttctata 


flatcra t*r*t*rrt" affrrf f nra 


ttcctggctt 


tctaatttcc 


5640 


atgtttgttc 




aataatcraa atratnrtTr 


tgagcctata 


tatttttaat 


5700 


gcttgcttaa 


aacttagttc 


t ctoactt t" s f^crrrt* trra r»» 
w Vp. i.y a i— i_ u i»c* oay y l. y ci y a. 


atattgaacc 


tatatacaaa 


5760 


tcttcacaca 


tttgcaaaag 


Qttcctaacc aatcttaacct 


agggaaataa 


actagataaa 


5820 


ctcctgaagt 


catttcaaac 


ccactcaaat ttatccrara 


gacattccaa 


tttctagaaa 


5880 


gctttactct 


ctcacctaga 


ttctcttccc tccaaagctt 


gctgtcctcc 


tgcctataca 


5940 


attctggatg 


ggcttcaaat 


acttaccagt ccagaattct 


ttgctcctca 


aggctgtacc 


6000 


cagctggcaa 


cagataatta 


cggtagttct ggagctggtt 


ggcatggcaa 


ctatcatgga 


6060 


cccagacatg 


agacacacaa 


ggaatcccac tggcaaggca 


caggaagtac 


ttccgggttc 


6120 


gacaatgctg 


atccgcaatt 


agaagacact ggtaagctgt 


gttacactgc 


aagaaaagaa 


6180 


gcagagccaa 


tgggtttggt 


gacttctgtg gaaagctcct 


aagcagcagc 


cataatgagc 


6240 


catgaagagc 


agatctgaag 


actcccaact actacccaaa 


atgtgattta 


gtctatcctg 


6300 


cccaaggcca 


ctcttctcac 


tggaaggccc aagtaatttc 


catagatgtt 


ctctctgcct 


6360 


cacctgcagc 


atactgagga 


cctaaatcct caacggacaa 


ccaaaaccta 


tgaactcagc 


6420 


ctttcaggct 


aaaaatcagc 


aaccctaata ggggtttcta


ctactaaaca 


taaacatcaa 


6480 


tcttcttttg 


tcccagcaac 


agaaccatag ccattaacta 


acccaaggtc 


ctaccttctc 


6540 


ttccctatac 


acaacaaaaa 


ttctatttca tgcaaaaaca 


ttttggcagt 


ttctcagttc 


6600 


ctgaaatctc 


tggctacttt 


atccaggttc cccaacccct 


cccaggcctc 


ttctcaacac 


6660 


agcaagttgg 


ctcttatcat 


tgccactata ttaggttaca 


caaagaaact 


cctcacctgg 


6720 


gcttcattga 


aatcttcaag 


gatatagcca gctcctgctc 


gaagctggga 


ttctgtatac 


6780 


tgcttgttga 


aaggaggaat 


ttccaaaaat tctatattaa 


aaaaaaaaac 


caagata 


6837 



<210> 17 
<211> 733
<212> DNA

<213> Artificial sequence 
<220> 

<223> Probe 
<400> 17 

cacaatctcc acgatagcag ggaataaaac cagtaagacc aagtatcttt agtgagaaac 60
ataatcgtgt ttatattttg gatgctgctt gaatccaatt ctctccccaa caatgaggca 120
ctggatcacc cactcttgtg acaccacagg cagctgcaat gcttcagcac acttcagcac 180
cgaggctggg catgaggggt ccgtcaccac cacatcaaat acccctaaag caatatctgc 240
aaggagcaag ggaaagtgaa gaaggaaagg acactcaact tagccctcca ttagaaagag 300
agatttgatt ctaaccaata catcccactc tgcacaaacc aaagccctat tatgtcaaac 360
acactgctac tgatcatgac caaaggcaga gttataatca ctatgtgctg accttgtaga 420
aatatttaac aaatatacgt ccagtgcttc acttatgttg actcacctct tgaaggtggt 480
acttttcttc tctaagaaac atggatacgg tcaacctatt aggcctgagc cttggaccac 540
aaggcctaac acctacaggt ctaaggagat ccctggaaca aagacactac acacactctt 600
tcaggtacct ttgttatggg cacttgaatg gtgctgcttc acagaggctg caccaccagt 660
catgaggatc tcagaccaga gctccaggaa gttctgctgt tggtctgata ccaagagtac 720
cttcagattc tgg 733

<210> 18

<211> 734

<212> DNA

<213> Artificial sequence



<220> 

<223> Probe 

<400> 18

gctagaattg cccaatctgt gtaggtataa attacttggt atagggagag agaaagccta 60 

tcttacctgt tgctttctta cttggtggta acatccagca gttagtctat ttataaacat 120 

aattactttt tcacatatga accataaaat atttaacttt ctgctctata ttgtttgtct 180 

accgctgtat ctcccacagc ttgaacagta ccaaggtacg tagtaggtgc tcaataaatg 240 

actattgaat aaatgaacat atccaacaaa tgttctcaat gtaaaggatc agagatgcca 300 

catgttctcc ttgatgggag agacccttcc acatgggaat gatgggaagg agttgtactc 360 

ctggatgttc agtaactgct tctaggagaa aaggtagagt cctatcacta agccgcagat 420 

atttatttgt gtgtggctag aatgggatgt tttgaatctt ctgttacaac cttgggaacg 480 

tggctgttat ttcaatttat gagccagaaa ttttcacatc ccgaaactgc ccagagttcc 540

accagcctgg gtatagtatt tgttataatc tagtcgtaac agtagttgag ccaaatctga 600

gttgatctga tgattccgaa cactggagag aatcttgaac aggagtgaag actggcggct 660

aaagcccttc acgagaatgc tcagctgggc ggctccacgc tcatccagtg ggcctaggtt 720

ctgactgacc agca 734

<210> 19 

<211> 2289 

<212> DNA 

<213> Homo sapiens 



<400> 19 

tcgcggccgc gtcgacgcgt ggtagggggc ccagagcaag ccgaaggcaa gcacgatggc 60

gctcaccagc cggcccaccc gcgccccgtg ccgcccggag ccccagcggg cgccccgcag 120 

ccgtgccagc gtcacgctgt agcagccgag catcagccga aaggaagcac gaaagcggtc 180 

agagtctcca ggctcaggtg ggcggcggcg tggaccggcg acgggtggca cagctggcat 240 

acgcggtccc tccacaggtg gcggtagacg gcggccggga cggcgagcaa cagggcggcc 300 

agccagaccg ccagcagcag gcggcgggcc agggccgggc tgcgcagccg aggcgccagg 360 

aaggggcggg tgactgcgag gcagcgctgc aggctgagca ggccggtgag cagcacgctt 420 

ggcgtacatg ctgagcgcgc acacgtagta caccgccttg cagcccgcct ggcccagcgg 480

ccaggcctgc cggtcaggaa ggccacaaag agcggcgtga gcagcagcac cgcgccgtcg 540 

gccagcgcca ggtgcagcac aagcgtggcc gccagcggtc gcccccgtgc aggctgccag 600 

cccgccaagc tccacaccac gaagccgttg ccaggcagcc ccagcagcgc cgccagcagc 660 

aggaaggctg tgcctgtggc ccgcgaagtc ttccagctca gcagtgtctc gttccctggg 720 

ggacggtagc agaccgacat ccttctgggc ctacaggaca cagaaaaaaa gtggggaagc 780 
tgggggaccc tacaaggatc cttggcagga aagcagggat tgtgttcatt ttgagggttt 840

cactgtcagt gagagtctca gcttccatgc aactgtccat cacggctgca actgaaatca 900

gagctgggac acagcgcacc agaagctaaa gtcttgatgc catcaaagga catcccctgc 960

cccattcaca yattcacatc tctgtcacgt ccactaatcg gcaaaaggag aaaagtgaga 1020

gaagatgacc taagtgtgac tgcagcaggc agctctggaa aatgaagcca gagcagtgag 1080

ccagcccctc ctccgaccaa ggaggaagga aagagcagcc ccagcacagg agagaaccac 1140

ccagcccaga agttccaggg aaggaactct ccggtccacc atggagtacc tctcagctct 1200

gaaccccagt gacttactca ggtcagtatc taatataagc tcggagtttg gacggagggt 1260

ctggacctca gctccaccac cccagcgacc tttccgtgtc tgtgatcaca agcggaccat 1320

ccggaaaggc ctgacagctg ccacccgcca ggagctgcta gccaaagcat tggagaccct 1380

actgctgaat ggagtgctaa ccctggtgct agaggaggat ggaactgcag tggacagtga 1440

ggacttcttc cagctgctgg aggatgacac gtgcctgatg gtgttgcagt ctggtcagag 1500

ctggagccct acaaggagtg gagtgctgtc atatgggcct ggacgggaga gccccaagca 1560

cagcaaggac atcggccgat tcacctttga cgtgtacaag caaaaccctc gagacctctt 1620

tggcagcctg aatgtcaaag ccacattcta cgggctctac tctatgagtt gtgactttca 1680

aggacttggc ccaaagaaag tactcaggga gctccttcgt tggacctcca cactgctgca 1740

aggcctgggc catatgttgc tgggaatttc ctccaccctt cgtcatgcag tggagggggc 1800

tgagcagtgg cagcagaagg gccgcctcca ttcctactaa ggggctctga gcttctgccc 1860

ccagaatcat tccaaccgac ccactgcaaa gactatgaca gcatcaaatt tcaggacctg 1920

cagacagtac aggctagata acccacccaa tttccccact gtcctctgat cccctcgtga 1980

cagaaccttt cagcataacg cctcacatcc caagtctata cccttacctg aagaatgctg 2040

ttctttccta gccacctttc tagcctccca cttgccctga aaggccaaga tcaagatgtc 2100

ccccaggcat cttgatccca gcctgactgc tgctacatct aatcccctac caatgcctcc 2160

tgtccctaaa ctccccagca tactgatgac agccctctct gactttacct tgagatctgt 2220

cttcataccc ttcccctcaa actaacaaaa acatttccaa taaaaatatc aaatatttac 2280

cgtcaaccc 2289



<210> 20 

<211> 1511 

<212> DNA 

<213> Homo sapiens 



<400> 20 
cacatttcat ccttttacat ggttcccatc taccctcaca acacatgtca tcaccaaaga 60

cacacataca agctccaatg gcttttgcca ggcaattctt cctccaggac cccatctggc 120

ccctccctca tccctcccct tggactttgc ccttcttact ggccaggcag gggggccaga 180

gtccaggctt gactcattcc caccttgtcc tgggctgaga tcccaggttt gtaacagaaa 240

acaccactaa agccccagca caggagagaa ccacccagcc cagaagttcc agggaaggaa 300

ctctccggtc caccatggag tacctctcag ctctgaaccc cagtgactta ctcaggtcag 360

tatctaatat aagctcggag tttggacgga gggtctggac ctcagctcca ccaccccagc 420

gacctttccg tgtctgtgat cacaagcgga ccatccggaa aggcctgaca gctgccaccc 480

gccaggagct gctagccaaa gcattggaga ccctactgct gaatggagtg ctaaccctgg 540

tgctagagga ggatggaact gcagtggaca gtgaggactt cttccagctg ctggaggatg 600

acacgtgcct gatggtgttg cagtctggtc agagctggag ccctacaagg agtggagtgc 660

tgtcatatgg cctgggacgg gagaggccca agcacagcaa ggacatcgcc cgattcacct 720

ttgacgtgta caagcaaaac cctcgagacc tctttggcag cctgaatgtc aaagccacat 780

tctacgggct ctactctatg agttgtgact ttcaaggact tggcccaaag aaagtactca 840

gggagctcct tcgttggacc tccacactgc tgcaaggcct gggccatatg ttgctgggaa 900

tttcctccac ccttcgtcat gcagtggagg gggctgagca gtggcagcag aagggccgcc 960

tccattccta ctaaggggct ctgagcttct gcccccagaa tcattccaac cgacccactg 1020

caaagactat gacagcatca aatttcagga cctgcagaca gtacaggcta gataacccac 1080

ccaatttccc cactgtcctc tgatcccctc gtgacagaac ctttcagcat aacgcctcac 1140

atcccaagtc tataccctta cctgaagaat gctgttcttt cctagccacc tttctggcct 1200

cccacttgcc ctgaaaggcc aagatcaaga tgtcccccag gcatcttgat cccagcctga 1260

ctgctgctac atctaatccc ctaccaatgc ctcctgtccc taaactcccc agcatactga 1320

tgacagccct ctctgacttt accttgagat ctgtcttcat acccttcccc tcaaactaac 1380

aaaaacattt ccaataaaaa tatcaaatat ttaccactaa gacttctgac tccaatttaa 1440

accaggaaag ggatggggtg gataccccat tttgccctcc cccatcaaca cccagtccca 1500

gatccaaagc c 1511

<210> 21

<211> 6530

<212> DNA

<213> Homo sapiens



<400> 21 

ttttgttagt ttgaggggaa gggtatgaag acagatctca aggtaaagtc agagagggct 60

gtcatcagta tgctggggag tttagggaca ggaggcattg gtaggggatt agatgtagca 120 

gcagtcaggc tgggatcaag atgcctgggg gacatcttga tcttggcctt tcagggcaag 180 

tgggaggcca gaaaggtggc taggaaagaa cagcattctt caggtaaggg tatagacttg 240 

ggatgtgagg cgttatgctg aaaggttctg tcacgagggg atcagaggac agtggggaaa 300 

ttgggtgggt tatctagcct gtactgtctg caggtcctga aatttgatgc tgtcatagtc 360 

tttgcagtgg gtcggttgga atgattctgg gggcagaagc tcagagcccc ttagtaggaa 420 

tggaggcggc ccttctgctg ccactgctca gccccctcca ctgcatgacg aagggtggag 480 

gaaattccca gcaacatatg gcccaggcct tgcagcagtg tggaggtcca acgaaggagc 540

tccctgaatg gcagagacaa gaggaaatca gatgatttgg aaaacttggg aggaagccat 600 

caagctggga gatgaggact ttccacaagc aagagctaac taggggtagg tgggtgcaag 660 

aggacgaatt atggggacta tccaactgta ggggatgggg cagtatgaca tgttgatttc 720

tgacctgagt actttctttg ggccaagtcc ttgaaagtca caactcatag agtagagccc 780 

gtagaatgtg gctttgacat tcaggctgcc aaagaggtct cgagggtttt gcttgtacac 840 
gtcaaaggtg aatcgggcga tgtccttgct gtgcttgggc ctctcccgtc ccaggccata 900

tgacagcact ccactctgta ggacaccctt gtcagtgcag tagatcctca taccagacac 960

ccaccactaa tctccatcag cactgggtca gaccctccct cgcttggact ttctgtccac 1020

tgtgtgacat ccttgacaat tccacaactc ctcctgcacc tggtccccag gatcagggtt 1080

aagctagaga ggaagcccgg gaaagctcta aaggacaggc attggaagca gccccagtat 1140

aggcctctta cccttgtagg gctccagctc tgaccagact gcaacaccat caggcacgtg 1200

tcatcctcca gcagctggaa gaagtcctca ctgtccactg cagttccatc ctcctctagc 1260

accagggtta gcactccatt cagcagtagg gtctccaatg cctgcccaat ggcaagaagc 1320 

aagaagggca ggtcttatcc catgcccctt ccctctttag ctgcccaaca tccatcagtt 1380 

ggctctagac attggtcgat gtcccacttt gactttccgg cactttgata cctcctaaag 1440 

gttgcagctc tccgtgttct tcagtttttg ggggatccta gctagaggct gacctttttc 1500 

ctctttgctc ctaccatgtc attggcatct ccccttgctc ccctccaagt cacttctggt 1560 

ttggaattgg aaagcaagcc aggttctcac gaagtccacc cttctgtctt atctacaatg 1620 

ctgcacctca cttcccacac cctcaagagt tctccagaag tgttttcagt aatagtgttt 1680 

aacctttttg agtccttact ctgtgccagg tatgaggact ttacctacat tatcctctta 1740 

ctcctttcaa caaccctagg aggtgatgta ttattattgc ctttttatag ttgaagaaac 1800 

tgaggttttg gtaggttgaa caacttccca aggtttgaca ggcaggaagt ggcagaatca 1860 

gaatttgaac ttgatttgtc acacaaatca cctttccata ctagcttctg aattctgtcc 1920 

ctcgaactct ccctatctcc tgctaacccc tgctcccata gaaaagctca ctcggtggaa 1980 

aatgaacaaa ttgaccagag ctcattaggc ccactccgct gcttttagcc ctcagaggga 2040

ggggcagctg tgtgacttca gccctctgct ccatcatcac aagttgccac tgttgtggag 2100 

ccccttggct acccctgcta taggaaccga ggaacttggc ctacttactt tggctagcag 2160 

ctcctggcgg gtggcagctg tcaggccttt ccggatggtc cgcttgtgat cacagacacg 2220 

gaaaggtcgc tggggtggtg gagctgaggt ccagaccctc cgtccaaact ccgagcttat 2280 

attagatact gacctggtag ttgagaagaa aagtcaagaa ggggcgagga ggggcttggt 2340

gagtgtaaag ggcatgatga gggtagagtg gctagagggc tagggaggga gagatctagg 2400 

tttatcgatt agggatgagg gagagaccat ggagtgcagg tgggggcggg tggctcagga 2460

gcttgacaag cccactgtgg agtggggagc aggagaggaa ggggtactgg ttagtctcct 2520 

aggggctgag tggagtattg ttgccctgcc tatatcccct aaaggtggag ggtagagcgg 2580 

agggttagca gtcacctgag taagtcactg gggttcagag ctgagaggta ctccatggtg 2640 

gaccggagag ttccttccct ggaacttctg ggctgggtgg ttctctcctg tgctggggct 2700 

ttagtggtgt tttctgttac aaacctggga tctcagccca ggacaaggtg ggaatgagtc 2760

aagcctggac tctggccccc ctgcctggcc agtaagaagg gcaaagtcca aggggaggga 2820 

tgagggaggg gccagatggg gtcctggagg aagaattgcc tggcaaaagc cattggagct 2880 

tgtatgtgtg tctttggtga tgacatgtgt tgtgagggta gatgggaacc atgtaaaagg 2940

atgaaatgtg acttctggtg tttttttatt tctatggagg gaatttctgg ggacggtttc 3000 

tggctctcag gctctgagaa gctgcagttt atgagtggct ctgtgtgtgc tgccacctac 3060

tggagaagcc ataagctgca gctttaggaa aagggaaccc ggggcagagt gtggggaagt 3120 

gggatggcag catggcaggg ctttggaaaa tgagaggtga gactgtgtcc aggaagggtg 3180 

taaggagagg atggatcctg atacatggat tcaggatcat tagggtcctg tctgggacac 3240 

tggccttcct gcttacctgc tctttccttc ctccttggtc ggaggagggg ctggctcact 3300

gctctggctt cattttccag agctgcctgc tgcagtcaca cttaggtcat cttctctcac 3360

ttttctcctt ttgccgatta gtggacgtga cagagatgtg aatggggcag ggatgtcctt 3420

tgatggcatc aagactttag cttctggtgc gctgtgtccc agctctgatt tcagttgcag 3480

ccgtgatgga cagttgcatg gaagctgaga ctctcactga cagtgaaacc ctcaaatgaa 3540

cacaatccct gctttcctgc caaggatcct tgtagggtcc cccagcttcc ccactttttt 3600

tctgtgtcct gacaaagaaa cacagagtaa cttgattgcc ctgtgacctg gccagttgca 3660

tttcccctgc aggcttgagc ccaagccaga gccttgaaaa ggtattcagg ttgttgccca 3720

aaacactgaa aaaaactggc cctggccctg aaccaaatac cttgaaccct cgtaaactcc 3780

ataccctgac ccccttgttt tggatatacc caggtagaac aactctctct cactgtctgt 3840

tgtgaggata cgctgtagcc cactcattaa gtacattctc ctaataaatg ctttggactg 3900

atcaccctgc cagtcttttg tcttgggcaa tctatacttt tctcagaggt tcccaaggcc 3960

tactgaaggg acttaacata ctcttaatgg ctttcctctc tcttgtttta ccttatgccc 4020

tcacttcctg agttaacctc ccaaatacag gatcacctgt acccaagccc ttagctcaag 4080

aatacaggat cacctgtacc caagccctta gctcaagctc tgctttggaa gaacccaaac 4140

taagacagtg ctcctggtgc cctccccaag caacctcaag ttctggctgt tacttgagca 4200

gaggcctttc ttttcccttc ccccagctct atccatctgc caggcccccc tcaaatctct 4260

tcatttccaa gttttgcttg acttttccaa gaggagaggg ctgcttctta gtatgtccct 4320

actcatcctt tcctttcttg tcttgtatcc tggtgcagcc tggtaatggg gcctcttcat 4380

ggttgtgtgt catgactccc taaccattat gcctccatgc atcccctgtt cctcctggaa 4440

cctagcacca tgccttacat ggaaaagctg tcattgacag cccggtgaga gccctgaggg 4500

tggagtgact ggggcagggc ctgaggcaag aggtgggagg aggtaggagg ccaggggctc 4560

agccggacca ggagactgga aacaggcaag gataaggcag gtgggggact gagttgtttg 4620

ggtcacctct gcaggccaga gagaccaggc aacatacaca ctgcagaagg tgggctggga 4680

ggattggggc cagagctggg ggagggatga gaacagaagc aggaccagga ttcagcagag 4740

tcctcctatt tccttccacc accagggaat cttactgccc cacttcagct tgtgctgttt 4800

cctggcaagg caggctctca catgcctgga cgcctgggtg cgttggtgat gggaaggagc 4860

agggtgaggg aggggcccca ggagaggccc aggatgagcc tcatcttgtc cctccccatt 4920

cttgtcttac cctctgcaaa tgtgataggc acaggacagg agtaggcacc tcgcctactg 4980

ctgcttaacc tttcagcttc tccaggcccc caatcctgct tgctcccagc ttggtaagta 5040

gatctgtgca cgtcccttta caccccacca tccagttttg cccagatgtg ctagaatggg 5100

gctggacaaa gaaggagggg ccagactaga ggagtggtgg tagagatagt gacagcctgg 5160

ggtgatgact ttatgcctgt ttaccactga gctctgggaa ggaggccagg agtggggcag 5220

gtcaactgac tgggagcagg ggatctgggt tccaagaagg agttgtgttt gaggtggggt 5280

ctgggtcctc gtggaagtca ggactcccag gcagaaaaga ggcaggctgc agggaagtaa 5340

ggaggaggca tggcaccttc tcatcgggca tcacaggtgg ggttttgccc cacccctgaa 5400

cgccctctgt ggcgccttcc acccacctgt aggcccagaa ggatgtcggt ctgctaccgt 5460

cccccaggga acgagacact gctgagctgg aagacttcgc gggccacagg cacagccttc 5520

ctgctgctgg cggcgctgct ggggctgcct ggcaacggct tcgtggtgtg gagcttggcg 5580

ggctggcggc ctgcacgggg gcgaccgctg gcggccacgc ttgtgctgca cctggcgctg 5640

gccgacggcg cggtgctgct gctcacgccg ctctttgtgg ccttcctgac ccggcaggcc 5700

tggccgctgg gccaggcggg ctgcaaggcg gtgtactacg tgtgcgcgct cagcatgtac 5760

gccagcgtgc tgctcaccgg cctgctcagc ctgcagcgct gcctcgcagt cacccgcccc 5820

ttcctggcgc ctcggctgcg cagcccggcc ctggcccgcc gcctgctgct ggcggtctgg 5880

ctggccgccc tgttgctcgc cgtcccggcc gccgtctacc gccacctgtg gagggaccgc 5940

gtatgccagc tgtgccaccc gtcgccggtc cacgccgccg cccacctgag cctggagact 6000

ctgaccgctt tcgtgcttcc tttcgggctg atgctcggct gctacagcgt gacgctggca 6060

cggctgcggg gcgcccgctg gggctccggg cggcacgggg cgcgggtggg ccggctggtg 6120

agcgccatcg tgcttgcctt cggcttgctc tgggccccct accacgcagt caaccttctg 6180

caggcggtcg cagcgctggc tccaccggaa ggggccttgg cgaagctggg cggagccggc 6240

caggcggcgc gagcgggaac tacggccttg gccttcttca gttctagcgt caacccggtg 6300

ctctacgtct tcaccgctgg agatctgctg ccccgggcag gtccccgttt cctcacgcgg 6360

ctcttcgaag gctctgggga ggcccgaggg ggcggccgct ctagggaagg gaccatggag 6420

ctccgaacta cccctcagct gaaagtggtg gggcagggcc gcggcaatgg agacccgggg 6480

ggtgggatgg agaaggacgg tccggaatgg gacctttgac agcagaccct 6530



<210> 22 

<211> 424 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> Probe 
<400> 22 

ggattagatg tagcagcagt caggctggga tcaagatgcc tgggggacat cttgatcttg 60 

gcctttcagg gcaagtggga ggctagaaag gtggctagga aagaacagca ttcttcaggt 120 

aagggtatag acttgggatg tgaggcgtta tgctgaaagg ttctgtcacg aggggatcag 180 

aggacagtgg ggaaattggg tgggttatct agcctgtact gtctgcaggt cctgaaattt 240 

gatgctgtca tagtctttgc agtgggtcgg ttggaatgat tctgggggca gaagctcaga 300 

gccccttagt aggaatggag gcggcccttc tgctgccact gctcagcccc ctccactgca 360 

tgacgaaggg tggaggaaat tcccagcaac atatggccca ggccttgcag cagtgtggag 420 

gtcc 424 

<210> 23
<211> 424
<212> DNA 

<213> Artificial sequence 



<220> 

<223> Probe 

<400> 23

ggacctccac actgctgcaa ggcctgggcc atatgttgct gggaatttcc tccacccttc 60 

gtcatgcagt ggagggggct gagcagtggc agcagaaggg ccgcctccat tcctactaag 120 
gggctctgag cttctgcccc cagaatcatt ccaaccgacc cactgcaaag actatgacag 180

catcaaattt caggacctgc agacagtaca ggctagataa cccacccaat ttccccactg 240

tcctctgatc ccctcgtgac agaacctttc agcataacgc ctcacatccc aagtctatac 300

ccttacctga agaatgctgt tctttcctag ccacctttct agcctcccac ttgccctgaa 360

aggccaagat caagatgtcc cccaggcatc ttgatcccag cctgactgct gctacatcta 420

atcc 424

<210> 24

<211> 7042

<212> DNA

<213> Homo sapiens



<400> 24 

aagaagaggt agcgagtgga cgtgactgct ctatcccggg caaaagggat agaaccagag 60

gtggggagtc tgggcagtcg gcgacccgcg aagacttgag gtgccgcagc ggcatccgga 120

gtagcgccgg gctccctccg gggtgcagcc gccgtcgggg gaagggcgcc acaggccggg 180

aagacctcct ccctttgtgt ccagtagtgg ggtccaccgg agggcggccc gtgggccggg 240

cctcaccgcg gcgctccggg actgtggggt caggctgcgt tgggtggacg cccacctcgc 300

caaccttcgg aggtccctgg gggtcttcgt gcgccccggg gctgcagaga tccaggggag 360

gcgcctgtga ggcccggacc tgccccgggg cgaagggtat gtggcgagac agagccctgc 420

acccctaatt cccggtggaa aactcctgtt gccgtttccc tccaccggcc tggagtctcc 480

cagtcttgtc ccggcagtgc cgccctcccc actaagacct aggcgcaaag gcttggctca 540

tggttgacag ctcagagaga gaaagatctg agggaagatg gatgcaaaag ctcgaaattg 600

tttgcttcaa catagagaag ctctggaaaa ggacatcaag acatcctaca tcatggatca 660

catgattagt gatggatttt taacaatatc agaagaggaa aaagtaagaa atgagcccac 720

tcaacagcaa agagcagcta tgctgattaa aatgatactt aaaaaagata atgattccta 780

cgtatcattc tacaatgctc tactacatga aggatataaa gatcttgctg cccttctcca 840

tgatggcatt cctgttgtct cttcttccag tgtaaggaca gtcctgtgtg aaggtggagt 900

accacagagg ccagttgttt ttgtcacaag gaagaagctg gtgaatgcaa ttcagcagaa 960

gctctccaaa ttgaaaggtg aaccaggatg ggtcaccata catggaatgg caggctgtgg 1020

gaagtctgta ttagctgcag aagctgttag agatcattcc cttttagaag gttgtttccc 1080

agggggagtg cattgggttt cagttgggaa acaagacaaa tctgggcttc tgatgaaact 1140

gcagaatctt tgcacacggt tggatcagga tgagagtttt tcccagaggc ttccacttaa 1200

tattgaagag gctaaagacc gtctccgcat tctgatgctt cgcaaacacc caaggtctct 1260

cttgatcttg gatgatgttt gggactcttg ggtgttgaaa gcttttgaca gtcagtgtca 1320

gattcttctt acaaccagag acaagagtgt tacagattca gtaatgggtc ctaaatatgt 1380

agtccctgtg gagagttcct taggaaagga aaaaggactt gaaattttat ccctttttgt 1440

taatatgaag aaggcagatt tgccagaaca agctcatagt attataaaag aatgtaaagg 1500

ctctcccctt gtagtatctt taattggtgc acttttacgt gattttccca atcgctggga 1560

gtactacctc aaacagcttc agaataagca gtttaagaga ataaggaaat cttcgtctta 1620

tgattatgag gctctagatg aagccatgtc tataagtgtt gaaatgctca gagaagacat 1680

caaagattat tacacagatc tttccatcct tcagaaggac gttaaggtgc ctacaaaggt 1740

gttatgtatt ctctgggaca tggaaactga agaagttgaa gacatactgc aggagtttgt 1800

aaataagtct cttttattct gtgatcggaa tggaaagtcg tttcgttatt atttacatga 1860

tcttcaagta gattttctta cagagaagaa ttgcagccag cttcaggatc tacataagaa 1920

gataatcact cagtttcaga gatatcacca gccgcatact ctttcaccag atcaggaaga 1980

ctgtatgtat tggtacaact ttctggccta tcacatggcc agtgccaaga tgcacaagga 2040

actttgtgct ttaatgtttt ccctggattg gattaaagca aaaacagaac ttgtaggccc 2100

tgctcatctg attcatgaat ttgtggaata cagacatata ctagatgaaa aggattgtgc 2160

agtcagtgag aattttcagg agtttttatc tttaaatgga caccttcttg gacgacagcc 2220

atttcctaat attgtacaac tgggtctctg tgagccggaa acttcagaag tttatcagca 2280

agctaagctg caggccaagc aggaggtcga taatggaatg ctttacctgg aatggataaa 2340

caaaaaaaac atcacgaatc tttcccgctt agttgtccgc ccccacacag atgctgttta 2400

ccatgcctgc ttttctgagg atggtcagag aatagcttct tgtggagctg ataaaacctt 2460

acaggtgttc aaagctgaaa caggagagaa acttctagaa atcaaggctc atgaggatga 2520

agtgctttgt tgtgcattct ctacagatga cagatttata gcaacctgct cagtggataa 2580

aaaagtgaag atttggaatt ctatgactgg ggaactagta cacacctatg atgagcactc 2640

agagcaagtc aattgctgcc atttcaccaa cagtagtcat catcttctct tagccactgg 2700

gtcaagtgac tgcttcctca aactttggga tttgaatcaa aaagaatgtc gaaataccat 2760

gtttggtcat acaaattcag tcaatcactg cagattttca ccagatgata agcttttggc 2820

tagttgttca gctgatggaa ccttaaagct ttgggatgcg acatcagcaa atgagaggaa 2880

aagcattaat gtgaaacagt tcttcctaaa tttggaggac cctcaagagg atatggaagt 2940

gatagtgaag tgttgttcgt ggtctgctga tggtgcaagg ataatggtgg cagcaaaaaa 3000

taaaatcttt ttgtggaata cagactcacg ttcaaaggtg gctgattgca gaggacattt 3060

aagttgggtt catggtgtga tgttttctcc tgatggatca tcatttttga catcttctga 3120

tgaccagaca atcaggctct gggagacaaa gaaagtatgt aagaactctg ctgtaatgtt 3180

aaagcaagaa gtagatgttg tgtttcaaga aaatgaagtg atggtccttg cagttgacca 3240

tataagacgt ctgcaactca ttaatggaag aacaggtcag attgattatc tgactgaagc 3300

tcaagttagc tgctgttgct taagtccaca tcttcagtac attgcatttg gagatgaaaa 3360

tggagccatt gagattttag aacttgtaaa caatagaatc ttccagtcca ggtttcagca 3420

caagaaaact gtatggcaca tccagttcac agccgatgag aagactctta tttcaagttc 3480

tgatgatgct gaaattcagg tatggaattg gcaattggac aaatgtatct ttctacgagg 3540

ccatcaggaa acagtgaaag actttagact cttgaaaaat tcaagactgc tttcttggtc 3600

atttgatgga acagtgaagg tatggaatat tattactgga aataaagaaa aagactttgt 3660

ctgtcaccag ggtacagtac tttcttgtga catttctcac gatgctacca agttttcatc 3720

tacctctgct gacaagactg caaagatctg gagttttgat ctccttttgc cacttcatga 3780

attgaggggc cacaacggct gtgtgcgctg ctctgccttc tctgtggaca gtaccctgct 3840

ggcaacggga gatgacaatg gagaaatcag gatatggaat gtctcaaacg gtgagcttct 3900

tcatttgtgt gctccgcttt cagaagaagg agctgctacc catggaggct gggtgactga 3960

cctttgcttt tctccagatg gcaaaatgct tatctctgct ggaggatata ttaagtggtg 4020

gaacgttgtc actggggaat cctcacagac cttctacaca aatggaacca atcttaagaa 4080

aatacacgtg tcccctgact tcaaaacata tgtgactgtg gataatcttg gtattttata 4140

tattttacag actttagaat aaaatagtta agcattaatg tagttgaact ttttaaattt 4200

ttgaattgga aaaaaattct aatgaaaccc tgatatcaac tttttataaa gctcttaatt 4260

gttgtgcagt attgcattca ttacaaaagt gtttgtggtt ggatgaataa tattaatgta 4320

gctttttccc aaatgaacat acctttaatc ttgtttttca tgatcatcat taacagtttg 4380

tccttaggat gcaaatgaaa atgtgaatac ataccttgtt gtactgttgg taaaattctg 4440

tcttgatgca ttcaaaatgg ttgacataat taatgagaag aatttggaag aaattggtat 4500

tttaatactg tctgtattta ttactgttat gcaggctgtg cctcagggta gcagtggcct 4560

gctttttgaa ccacacttac cccaaggggg ttttgttctc ctaaatacaa tcttagaggt 4620

tttttgcact ctttaaattt gctttaaaaa tattgtgtct gtgtgcatag tctgcagcat 4680

ttcctttaat tgactcaata agtgagtctt ggatttagca ggccccccca cctttttttt 4740

ttgtttttgg agacagagtc ttgctttgtt gccaggctgg agtgcagtgg cgcgatctcg 4800

gctcaccaca atcgctgcct cctgggttca agcaattctc ctgcctcagc ctcccgagta 4860

gctgggacta caggtgtgcg cacatgccag gctaattttt gtatttttag tagagacggg 4920

gtttcaccat gttggccggg atggtctcga tctcttgacc tcatgatcta cccgccttgg 4980

cctcccaaag tgctgagatt acaggcgtga gccaccgtgc ctggccaggc cccttctctt 5040

ttaatggaga cagggtcttg cactatcacc caggctggag tgcagtggca taatcatacc 5100

tcattgcagc ctcagactcc tgggttcaag caatcctcct gcctcagcct cccaagtagc 5160

tgagactgca ggcacgagcc accacaccca gctaattttt aagttttctt gtagagacag 5220

ggtctcacta tgttgtctag gctggtcttg aactcttggc ctcaagtaat cctcctgcct 5280

cagcctccca aagtgttggg attgcagata tgagccactg gcctggcctt cagcagttct 5340

ttttgtgaag taaaacttgt atgttggaaa gagtagattt tattggtcta cccttttctc 5400

actgtagctg ctggcagccc tgtgccatat ctggactcta gttgtcagta tctgagttgg 5460

acactattcc tgctccctct tgtttcttac atatcagact tcttacttga atgaaacctg 5520

atctttccta atcctcactt ttttcttttt taaaaagcag tttctccact gctaaatgtt 5580

agtcattgag gtggggccaa ttttaatcat aagccttaat aagatttttc taagaaatgt 5640

gaaatagaac aattttcatc taattccatt tacttttaga tgaatggcat tgtgaatgcc 5700

attcttttaa tgaatttcaa gagaattctc tggttttctg tgtaattcca gatgagtcac 5760

tgtaactcta gaagattaac cttccagcca acctattttc ctttcccttg tctctctcat 5820

cctcttttcc ttccttcttt cctttctctt cttttatctc caaggttaat caggaaaaat 5880

agcttttgac aggggaaaaa actcaataac tagctatttt tgacctcctg atcaggaact 5940

ttagttgaag cgtaaatcta aagaaacatt ttctctgaaa tatattatta agggcaatgg 6000

agataaatta atagtagatg tggttcccag aaaatataat caaaattcaa agattttttt 6060

tgtttctgta actggaacta aatcaaatga ttactagtgt taatagtaga taacttgttt 6120

ttattgttgg tgcatattag tataactgtg gggtaggtcg gggagagggt aagggaatag 6180

atcactcaga tgtattttag ataagctatt tagcctttga tggaatcata aatacagtga 6240

atacaatcct ttgcattgtt aaggaggttt tttgttttta aatggtgggt caaggagcta 6300

gtttacaggc ttactgtgat ttaagcaaat gtgaaaagtg aaaccttaat tttatcaaaa 6360

gaaatttctg taaatggtat gtctccttag aatacccaaa tcataatttt atttgtacac 6420

actgttaggg gctcatctca tgtaggcaga gtataaagta ttaccttttg gaattaaaag 6480

ccactgactg ttataaagta taacaacaca catcaggttt taaaaagcct tgaatggccc 6540

ttgtcttaaa aagaaattag gagccaggtg cggtggcacg tgcctgtagt cccagctcct 6600

tgggaggctg agacaggagg attccttgag ccctggagtt tgagtccagc ctgggtgaca 6660

tagcaagacc ctgtcttaaa agaaaaatgg gaagaaagac aaggtaacat gaagaaagaa 6720

gagataccta gtatgatgga gctgcaaatt tcatggcagt tcatgcagtc ggtcaagagg 6780

aggattttgt tttgtagttt gcagatgagc atttctaaag cattttccct tgctgtattt 6840

ttttgtatta taaattacat tggacttcat atatataatt tttttttaca ttatatgtct 6900

cttgtatgtt ttgaaactct tgtatttatg atatagctta tatgattttt ttgccttggt 6960

atacatttta aaatatgaat ttaaaaaatt tttgt^aaaa taaaattcac aaaattgttt 7020

tgaaaaacaa aaaaaaaaaa aa 7042

<210> 25

<211> 3019

<212> DNA

<220>

<223> Probe

<220>

<221> misc_feature

<222> (2846)..(2846)

<223> Any nucleotide



<400> 25 
tttttttttt 


tttttttgaa 


aaacattttt 


ggattgtttc 


attctttgct 


tgtcatttat 


60 


ctgttgatta 


gaccactaaa 


gtgaaggatt 


caagctaaat 


acatcaacct 


ttctatttag 


120 


gctttatcag 


ctatatgtaa 


attcaattct 


atcaaaattt 


tctgagtgcc 


tcctcagtgt 


180 


gtctctctga 


tggttcctgc 


ccggtatggc 


tggcatgaag 


aagatccacg 


gacttgcgaa 


240 


tgctaacgcg 


gggcttgggg 


atgggtttgg 


agggtttgtt 


ttcaaagctt 


tctggaagtg 


300 


tggaggagtg 


tccccctttt 


cttgcttgta 


gtgctagctg 


gtaagcgact 


tcgaatgcct 


360 


gtcccagggt 


taggatgatt 


tcataggcta 


aattcacatc 


aaaggcagta 


aacacatgac 


420 


agtagtggtg 


attagacttc 


aaatcttttg 


tgatataggc 


aaatgttgag 


aggtcttctg 


480 


ggtcctgggc 


agcacaggag 


atattacgaa 


tttcatgctc 


agcaattatg 


ttcttatttg 


540 


ttgcatcaat 


aaatttgact 


cctttatatg 


agacagaaag 


aataatagta 


gggaccttct 


600 


tcatttgctc 


tgtagacttc 


tgacagttag 


cccgcatttt 


tgcacaagca 


tcttgggttg 


660 




W w w t-* KA \A W W W W 


tttatcaaca 

W W W W w U 


tagaacctaa


ataaaaagct 


ttgtaatcac 


720 


acgactggaa 


gataagcttt 


tctgggtgat 


gctgccagta 


ctgtaccggg 


gtagaggctg 


780 


tggcttcatt 


cggaggtcgc 


aaggtaatgg 


aaggttctcc 


ccagtctcct 


gtctgagcca 


840 


tctgcctctc 


cagttttgat 


cggggaatat 


catcaaagta 


gttttcattt 


cttctcctcc 


900 


ttgcatcgcc 


ctgcatgata 


atgtgaggaa 


cgtctaggga 


gccaccagtg 


gtgtaagtgc 


960 


tttggctaag 


tgatggagac 


aactgaggag 


gagtgtgatt 


accactgggt 


tccctgaggg 


1020 


tgatggaccg 


agggggcttc 


tgtgggggat 


cgtcgtgcag 


cctgtctccc 


agagatgcca 


1080 


aaatacgttt 


cctgtggcca 


atcaaattga 


tttttaaaac 


attaataagt 


tcaacctccc 


1140 


agattttttt 


caacaggtcc 


atcgaagtgt 


agccattaat 


tagaaaggct 


ttggtgtagt 


1200 


cgcccagttc 


aatggaatcc 


agccactcag 


ctacagaggt 


gggatggtag 


ccatcatgcc 


1260 


caatgggtct 


catctttgga 


aggagctgga 


ttgcctgtag 


aattctttgt 


ctgtgcccag 


1320 


aattaaggat 


tccaatttcc 


aacaaatcct 


gatcttccat 


aacattgctt 


cccataaact 


1380 


gcacattgtc 


aaatccatta 


gccatcaggt 


ggttctcgta 


ctgaggtagc 


ccaatgcttt 


1440 


ccaaccattg 


tcccactgtt 


tggacagggc 


atctgggtct 


tgtggtctca 


ccattcatct 


1500 


ctttaagttc 


gttgttgatt 


ccaacatcta 


tggaactcat 


tattttgtca 


atttcttccc 


1560 
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attccgatgt 

aagattcact 

aatcagattt 

ttattctccc 

ccccattccg 

ggttgctttt 

aaggttctga 

tgacttcatc 

caggggactc 

attcttttgg 

cattcttgac 

taaagcttgt 

ggtgaaaatc 

ttacaatatt 

ttgagctgtt 

cagttcttgg 

tgtccatgag 

atggtacaat 

ggaagccaag 

cacacgtatt 

atgaggttct 

aattatctaa 

cttgacagag 

tggtcttttg 

cctgtacagg 



gaaggatggt 

ccagttaact 

agagacactt 

cagagtacag 

agatccactc 

tttgtgaaaa 

gttattttct 

ttgtcctttt 

acaggctgga 

gggatcattg 

ttctgtggtt 

gcaccctgta 

cagagaagac 

tttgagggca 

tctatggtta 

ggaaggtgcc 

atcacacaga 

ttccatagtg 

gttcctacat 

ttcatcatca 

tacactctgg 

gtagtggtct 

ttttatttca 

ggaaggagac 

ctcttcgag 



gttctttcag 

cttgatgttt 

tttgacaaat 

gctctctcca 

ctggttgatc 

attgttctac 

attggagact 

tcacattgtt 

gaggatccat 

tcatcctgtc 

cccacagaag 

gatgtgttga 

acaatggatg 

gtatcagggg 

ctagttcctg 

cttgcaattt 

aagttctcat 

taatttctct

ccattacacg 

tcctcttctt 

cttccatttt 

gatatngtgt 

tccaagagtt 

tcaacaggag 




aattcccttt 
tctcattgga 
gcatgtcgat 
caaatccccc 
ttgtgccaac 
tgaccacttt 
gcttgaaagg 
ctcttttccc 
ggagcaggcc 
gggagaggtc 
aggtgggtgg 
tttcaaaata 
ttcgctgttt 
atggaggtga 
gagtagtaac 
ctaaggagca 
tttctgaagg 
tctttggata 
gagttaatgc 
ccacttctcc 
tcccaagttc 
ggcacaagtc 
ttgataattc 
atgaaatgtg 



agaactgtgt 
aggataggca 
caaggcttta 
tgcgttcata 
aatggtatgg 
gggtttaatt 
caaaggactg 
atagagatga 
tgcaaattgc 
atctgtatgg 
actagcagga 
ttcttggttg 
aggctggggt 
acaatcaggt 
tgctacctca 
aggtttcttt 
aaatgtatcc 
ggactcctgg 
ttcccaaagt 
tggtgacaaa 
ttcctctgaa 
ttcaaacgaa 
tccagtgacg 
tgtttcttgt 



tcagcagtgg 

atgagatcag 

ggcaatgacc 

acccactggt 

ttttcgagtt 

ttctttacca 

tttgccaaac 

aatggatttt 

ccaggatcat 

ttagttccct 

ggactggcag 

tgattcattc 

cgaatgactt 

gttgggcctg 

gaggcattat 

gtaacagctg 

agagaagcag 

gcaagcatgg 

cctgatggcc 

ttgattgtag 

atcttgctca 

taatcctttt 

gtttcacttt 

gttgcatctt 



1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3019 



<210> 26 

<211> 1752 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> Probe 
<400> 26 

agaacgcaga ccagcccaag ctgacagctt gagtatgcct tcttctgctg cctggttttg 
ggggctgtat gacgtactgg tcggtagtaa agattaatat gtaagaaatg tggagctagg 
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atcaagtcat 


actccacagc 


ctgcctggca 


aactatgttt 


tacttctgac 


+- 4- s—r ^ ^ a**> 4- 

tttgctcccL 




cgctgagaac 


attaatctgt 


caagctggcg 


ggctcctttg 


atagcaactt 


tcccaggggc 


240


atgatgtggc aatgccacct 


ctcagcccag 


gactaccgct 


attaccccgt 


ggacggctac 


300


tccctgctta 


aacgcttccc 


tcttcatcct 


cttacaggac 


ccagatgccc 


tgtccaaaca 


360


gtgggacaat 


ggttggaaag 


cattgggcta 


cctcagtacg 


agaaccacct 


gatggctaat 


420


ggatttgaca 


atgtgcagtt 


tatgggaagc 


aatgttatgg 


aagatcagga 


tttgtxggaa 


480


attggaatcc 


ttaattctgg 


gcacagacaa 


agaattctac aggcaatcca 


gctccttcca 


540


aagatgagac 


ccattgggca 


tgatggctac 


catcccacct 


ctgtagctga 


gtggctggat 


600


tccattgaac 


tgggcgacta 


caccaaagcc 


tttctaatta atggctacac 


ttcgatggac 


660


ctgttgaaaa 


aaatctggga 


ggttgaactt 


attaatgttt 


taaaaatcaa 


tttgattggc 


720


cacaggaaac 


gtattttggc 


atctctggga 


gacaggctgc 


acgacgatcc 


cccacagaag 


780


ccccctcggt 


ccatcaccct 


caggacagga 


gactggggag 


aaccttccat 


taccttgcga 


840


cctccgaatg 


aagccacagc 


ctctaccccg 


gtacagtact 


ggcagcatca 


cccagaaaag 




cttatcttcc 


agtcgtgtga 


ttacaaagct 


ttttatttag gttctatgct 


gataaaagag 




cttaggggga 


cagaatcaac 


ccaagatgct 


tgtgcaaaaa 


tgcgggctaa 


ctgtcagaag 


1020


tctacagagc 


aaatgaagaa 


ggtccctact 


attattcttt 


ctgtctcata 


taaaggagtc 


1080


aaatttattg 


atgcaacaaa 


taagaacata 


attgctgagc 


atgaaattcg 


taatatctcc 


1140 


tgtgctgccc 


aggacccaga 


agacctctca 


acatttgcct 


atatcacaaa 


agatttgaag 


1200


tctaatcacc 


actactgtca 


tgtgtttact 


gcctttgatg 


tgaatttagc 


ctatgaaatc 


1260


atcctaaccc 


tgggacaggc 


attcgaagtc 


gcttaccagc 


tagcactaca 


agcaagaaaa 




gggggacact 


cctccacact 


tccagaaagc 


tttgaaaaca 


aaccctccaa 


acccatcccc 




aagccccgcg 


ttagcattcg 


caagtccgtg 


gatcttcttc 


atgccagcca 


taccgggcag 


1440


gaaccatcag 


agagacacac 


tgaggaggca 


ctcagaaaat 


tttgatagaa 


ttgaatttac


1500 


atatagctga 


taaagcctaa 


atagaaaggt 


tgatgtattt 


agcttgaatc 


cttcacttta 


1560 


gtggtctaat 


caacagataa 


atgacaagca 


aagaatgaaa 


caatccaaaa 


atgtttttca 


1620 


aaacaatttt 


gtgaatttta 


tttttacaaa 


aattttttaa 


attcatattt 


taaaatgtat 


1680 


accaaggcaa 


aaaaatcata 


taagctatat 


cataaataca 


agagtttcaa 


aacatacaag 


1740 


agacatataa 


tg 










1752 



<210> 27 

<211> 367 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Probe 
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<400> 27 

ccgcgttagc attcgcaagt ccgtggatct tcttcatgcc agccataccg ggcaggaacc 60 

atcagagaga cacactgagg aggcactcag aaaattttga tagaattgaa tttacatata 120 

gctgataaag cctaaataga aaggttgatg tatttagctt gaatccttca ctttagtggt 180 

ctaatcaaca gataaatgac aagcaaagaa tgaaacaatc caaaaatgtt tttcaaaaca 240

attttgtgaa ttttattttt acaaaaattt tttaaattca tattttaaaa tgtataccaa 300

ggcaaaaaaa tcatataagc tatatcataa atacaagagt ttcaaaacat acaagagaca 360

tataatg 367

<210> 28

<211> 367

<212> DNA

<213> Artificial sequence



<220> 

<223> Probe 
<400> 28 

cattatatgt ctcttgtatg ttttgaaact cttgtattta tgatatagct tatatgattt 60
ttttgccttg gtatacattt taaaatatga atttaaaaaa tttttgtaaa aataaaattc 120
acaaaattgt tttgaaaaac atttttggat tgtttcattc tttgcttgtc atttatctgt 180
tgattagacc actaaagtga aggattcaag ctaaatacat caacctttct atttaggctt 240
tatcagctat atgtaaattc aattctatca aaattttctg agtgcctcct cagtgtgtct 300
ctctgatggt tcctgcccgg tatggctggc atgaagaaga tccacggact tgcgaatgct 360
aacgcgg 367

<210> 29

<211> 2457

<212> DNA

<213> Homo sapiens



<400> 29 
cacgcagcag 


gatggcaagg 


gctccgcttg gggtcctgct cctcttgggg 


cttctcggca 


60 


ggggtgtggg 


gaagaacgag 


gaactgcgtc tttatcacca tctcttcaac 


aactatgacc 


120 


caggaagccg 


gccagtgcgg 


gagcctgagg atactgtcac catcagcctc 


aaggtcaccc 


180 


tgacgaatct 


catctcactg 


aatgaaaaag aggagactct caccactagc 


gtctggattg 


240 


gaatcgattg 


gcaggattac 


cgactcaact acagcaagga cgactttggg 


ggtatagaaa 


300 


ccctgcgagt 


cccttcagaa 


ctcgtgtggc tgccagagat tgtgctggaa 


aacaatattg 


360 


atggccagtt 


cggagtggcc 


tacgacgcca acgtgctcgt ctacgagggc 


ggctccgtga 


420 
cgtggctgcc tccggccatc taccgcagcg tctgcgcagt ggaggtcacc tacttcccct 480
tcgattggca gaactgttcg cttattttcc gctctcagac gtacaatgcc gaagaggtgg 540
agttcacttt tgccgtagac aacgacggca agaccatcaa caagatcgac atcgacacag 600
aggcctatac tgagaacggc gagtgggcca tcgacttctg cccgggggtg atccgccgcc 660
accacggtgg cgccaccgac ggcccagggg agactgacgt catctactcg ctcatcatcc 720
gccggaagcc gctcttctac gtcattaaca tcatcgtgcc ctgtgtgctc atctcgggcc 780
tggtgctgct cgcctacttc ctgccggcgc aggccggcgg ccagaaatgc acggtctcca 840
tcaacgtcct gctcgcccag accgtcttct tgttcctcat tgcccagaaa atcccagaga 900
cttctctgag cgtgccgctc ctgggcaggt tccttatttt cgtcatggtg gtcgccacgc 960
tcattgtcat gaattgcgtc atcgtgctca acgtgtccca gcggacgccc accacccacg 1020
ccatgtcccc gcggctgcgc cacgttctcc tggagctgct gccgcgcctc ctgggctccc 1080
cgccgccgcc cgaggccccc cgggccgcct cgcccccaag gcgggcgtcg tcggtgggct 1140
tattgctccg cgcggaggag ctgatactga aaaagccacg gagcgagctc gtgtttgagg 1200
ggcagaggca ccggcagggg acctggacgg ctgccttctg ccagagcctg ggcgccgccg 1260
cccccgaggt ccgctgctgt gtggatgccg tgaacttcgt ggccgagagc acgagagatc 1320
aggaggccac cggcgaggaa gtgtccgact gggtgcgcat ggggaatgcc cttgacaaca 1380
tctgcttctg ggccgctctg gtgctcttca gcgtgggctc cagcctcatc ttcctcgggg 1440
cctacttcaa ccgagtgcct gatctcccct acgcgccgtg tatccagcct tagctcgcac 1500
cgacttcaat ttcccaccca tctccagtag gaaattgatt ttgaaaaagt aggctgccgc 1560
caccacggca ttatgatccc ttccccctgc tgatcaatct gcagtttgtg aacttcacaa 1620
gaatggtgtg tgcccgttcc ctggcgtgtg taggcctggc cgcagtccag gggtcagcag 1680
gaggaaaggg ttcacatagg ctctcaggtg ccagtcttcc agaaagcaag gactgccctt 1740
cattcagcct tgctgacctc ccagcctttc taaggctcag ccccacggga ctctggtggc 1800
tgccagcttg tgagctatct atctatattc atttcatagc caaacaggag acccctttgc 1860
aggacttgca cacagggagg ctgtagccag gaaaccctct tcttccctgg tctggctctg 1920
ctggagcggg tgggaaccaa acaccttcag tgctggtggc cctcaggccc acaggtttaa 1980
ggctgaggct gccctgaccc ttccacagtc atttcttcta ggttttcttg gcccagcact 2040
gcccatccca ccccatgagg ctcactcatt gcagatccca gcccaccctg cccctttctt 2100
ccccaccctg gaggctctct ctgcctagtc tacagtactg acagaaagca aggacatgcg 2160
gcctgcatgg tgggagctgg ttgaattgtc tttattaaca aacaggatat ccaaggccac 2220
tacattgagg aggggggagg ggggagggag gagaagggtt acttgctgct cacactatat 2280
acagatgcaa gcaaggggcg tggagagtga gggctccctg ctccctccct ccaccgggga 2340
agggcatggg ctagaagagg agaggggggt cgggaatggg gggaatgttt tggctgcggg 2400
gtcccccctc cattccctgg agtttggggg aaggggaatc attaaagtgc tttcaga 2457
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<211> 4863 

<212> DNA 

<213> Homo sapiens 



<400> 30

ggagatagcg cctgtcagtc ggtgggtcgg tcctcgcgcc ggccctcccc ctccccggtc 60
tccgggggag gcgcggtgga gtccgccccc ggggttctcc gatgggggag aagcggcgac 120
ggcggcagtg gagtaaccga gccggagcgt gagcggcccc ggtgccccgt tccccacgga 180
ggccatgggc gacccagccc ccgcccgcag cctggacgac atcgacctgt ccgccctgcg 240
ggaccctgct gggatctttg agcttgtgga ggtggtcggc aatggaacct acggacaggt 300
gtacaagggt cggcatgtca agacggggca gctggctgcc atcaaggtca tggatgtcac 360
ggaggacgag gaggaagaga tcaaacagga gatcaacatg ctgaaaaagt actctcacca 420
ccgcaacatc gccacctact acggagcctt catcaagaag agccccccgg gaaacgatga 480
ccagctctgg ctggtgatgg agttctgtgg tgctggttca gtgactgacc tggtaaagaa 540
cacaaaaggc aacgccctga aggaggactg tatcgcctat atctgcaggg agatcctcag 600
gggtctggcc catctccatg cccacaaggt gatccatcga gacatcaagg ggcagaatgt 660
gctgctgaca gagaatgctg aggtcaagct agtggatttt ggggtgagtg ctcagctgga 720
ccgcaccgtg ggcagacgga acactttcat tgggactccc tactggatgg ctccagaggt 780
catcgcctgt gatgagaacc ctgatgccac ctatgattac aggagtgata tttggtctct 840
aggaatcaca gccatcgaga tggcagaggg agccccccct ctgtgtgaca tgcaccccat 900
gcgagccctc ttcctcattc ctcggaaccc tccgcccagg ctcaagtcca agaagtggtc 960
taagaagttc attgacttca ttgacacatg tctcatcaag acttacctga gccgcccacc 1020
cacggagcag ctactgaagt ttcccttcat ccgggaccag cccacggagc ggcaggtccg 1080
catccagctt aaggaccaca ttgaccgatc ccggaagaag cggggtgaga aagaggagac 1140
agaatatgag tacagcggca gcgaggagga agatgacagc catggagagg aaggagagcc 1200
aagctccatc atgaacgtgc ctggagagtc gactctacgc cgggagtttc tccggctcca 1260
gcaggaaaat aagagcaact cagaggcttt aaaacagcag cagcagctgc agcagcagca 1320
gcagcgagac cccgaggcac acatcaaaca cctgctgcac cagcggcagc ggcgcataga 1380
ggagcagaag gaggagcggc gccgcgtgga ggagcaacag cggcgggagc gggagcagcg 1440
gaagctgcag gagaaggagc agcagcggcg gctggaggac atgcaggctc tgcggcggga 1500
ggaggagcgg cggcaggcgg agcgcgagca ggaatacaag cggaagcagc tggaggagca 1560
gcggcagtca gaacgtctcc agaggcagct gcagcaggag catgcctacc tcaagtccct 1620
gcagcagcag caacagcagc agcagcttca gaaacagcag cagcagcagc tcctgcctgg 1680
ggacaggaag cccctgtacc attatggtcg gggcatgaat cccgctgaca aaccagcctg 1740
ggcccgagag gtagaagaga gaacaaggat gaacaagcag cagaactctc ccttggccaa 1800
gagcaagcca ggcagcacgg ggcctgagcc ccccatcccc caggcctccc cagggccccc 1860
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aggacccctt tcccagactc ctcctatgca gaggccggtg gagccccagg agggaccgca 1920 

caagagcctg gtggcacacc gggtcccact gaagccatat gcagcacctg taccccgatc 1980 

ccagtccctg caggaccagc ccacccgaaa cctggctgcc ttcccagcct cccatgaccc 2040 

cgaccctgcc atccccgcac ccactgccac gcccagtgcc cgaggagctg tcatccgcca 2100 

gaattcagac cccacctctg aaggacctgg ccccagcccg aatcccccag cctgggtccg 2160 

cccagataac gaggccccac ccaaggtgcc tcagaggacc tcatctatcg ccactgccct 2220 

taacaccagt ggggccggag ggtcccggcc agcccaggca gtccgtgcca gtaaccccga 2280 

cctcaggagg agcgaccctg gctgggaacg ctcggacagc gtccttccag cctctcacgg 2340

gcacctcccc caggctggct cactggagcg gaaccgcgtg ggagtctcct ccaaaccgga 2400

cagctcccct gtgctctccc ctgggaataa agccaagccc gacgaccacc gctcacggcc 2460

aggccggccc gcaagctata agcgagcaat tggtgaggac tttgtgttgc tgaaagagcg 2520 

gactctggac gaggcccctc ggcctcccaa gaaggccatg gactactcgt cgtccagcga 2580 

ggaggtggaa agcagtgagg acgacgagga ggaaggcgaa ggcgggccag cagaggggag 2640

cagagatacc cctgggggcc gcagcgatgg ggatacagac agcgtcagca ccatggtggt 2700 

ccacgacgtc gaggagatca ccgggaccca gcccccatac gggggcggca ccatggtggt 2760

ccagcgcacc cctgaagagg agcggaacct gctgcatgct gacagcaatg ggtacacaaa 2820 

cctgcctgac gtggtccagc ccagccactc acccaccgag aacagcaaag gccaaagccc 2880 

accctcgaag gatgggagtg gtgactacca gtctcgtggg ctggtaaagg cccctggcaa 2940 

gagctcgttc acgatgtttg tggatctagg gatctaccag cctggaggca gtggggacag 3000 

catccccatc acagccctag tgggtggaga gggcactcgg ctcgaccagc tgcagtacga 3060 

cgtgaggaag ggttctgtgg tcaacgtgaa tcccaccaac acccgggccc acagtgagac 3120 

ccctgagatc cggaagtaca agaagcgatt caactccgag atcctctgtg cagccctttg 3180 

gggggtcaac ctgctggtgg gcacggagaa cgggctgatg ttgctggacc gaagtgggca 3240

gggcaaggtg tatggactca ttgggcggcg acgcttccag cagatggatg tgctggaggg 3300 

gctcaacctg ctcatcacca tctcagggaa aaggaacaaa ctgcgggtgt attacctgtc 3360 

ctggctccgg aacaagattc tgcacaatga cccagaagtg gagaagaagc agggctggac 3420 

caccgtgggg gacatggagg gctgcgggca ctaccgtgtt gtgaaatacg agcggattaa 3480 

gttcctggtc atcgccctca agagctccgt ggaggtgtat gcctgggccc ccaaacccta 3540

ccacaaattc atggccttca agtcctttgc cgacctcccc caccgccctc tgctggtcga 3600

cctgacagta gaggaggggc agcggctcaa ggtcatctat ggctccagtg ctggcttcca 3660 

tgctgtggat gtcgactcgg ggaacagcta tgacatctac atccctgtgc acatccagag 3720 

ccagatcacg ccccatgcca tcatcttcct ccccaacacc gacggcatgg agatgctgct 3780 

gtgctacgag gacgagggtg tctacgtcaa cacgtacggg cgcatcatta aggatgtggt 3840 

gctgcagtgg ggggagatgc ctacttctgt ggcctacatc tgctccaacc agataatggg 3900 

ctggggtgag aaagccattg agatccgctc tgtggagacg ggccacctcg acggggtctt 3960 

catgcacaaa cgagctcaga ggctcaagtt cctgtgtgag cggaatgaca aggtgttttt 4020 
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tgcctcagtc 


cgctctgggg 


gcagcagcca 


agtttacttc 


atgactctga 


accgtaactg


4080 


catcatgaac 


tggtgacggg 


gccctgggct ggggctgtcc cacactggac ccagctctcc 


4140 


ccctgcagcc 


aggcttcccg 


ggccgcccct 


ctttcccctc 


cctgggcttt 


tgcttttact 


4200 


ggtttgattt 


cactggagcc 


tgctgggaac gtgacctctg acccctgatg ctttcgtgat


4260 


cacgtgacca 


tcctcttccc 


caacatgtcc 


tcttcccaaa 


actgtgcctg 


tccccagctt 


4320 


ctggggaggg 


acacagcttc 


cccttcccag gaattgagtg 


ggcctagccc 


ctcccccctt 


4380 


ttctccattt 


gagaggagag 


tgcttggggc 


ttgaacccct 


taccccactg 


ctgctgactg 


4440 


ggcagggccc 


tggacccctt


tatttgcacg


tcaggggagc 


cggctccccc 


cttgaatgta 


4500 


ccagaccctg 


gggggggtca 


ctgggcccta 


gatttttggg 


gggtcaccag 


ccactccagg 


4560 


v^^ua^^^ u. Vh## 




tttctgaaag 


cactttaatg 


attccccttc 


ccccaaactc 


4620 


cagggaatgg 


aggggggacc 


ccgccagcca 


aaacattccc 


cccattcccg 


acccccctct 


4680 


cctcttctag 


cccatgccct 


tccccggtgg 


agggagggag 


cagggagccc 


tcactctcca 


4740 


cgccccttgc 


ttgcatctgt 


atatagtgtg 


agcagcaagt 


aacccttctc 


cctccccccc 


4800 


cacccctcct 


caatgtagtg 


gccttggata


tcctgtttgt 


taataaagac 


aattcaacca 


4860 


gct












4863 



<210> 31 

<211> 283 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> Probe 
<400> 31 

agctggttga attgtcttta ttaacaaaca ggatatccaa ggccactaca ttgaggaggg 60 
gggagggggg agggaggaga agggttactt gctgctcaca ctatatacag atgcaagcaa 120 
ggggcgtgga gagtgagggc tccctgctcc ctccctccac cggggaaggg catgggctag 180
aagaggagag gggggtcggg aatgggggga atgttttggc tgcggggtcc cccctccatt 240
ccctggagtt tgggggaagg ggaatcatta aagtgctttc aga 283 

<210> 32 

<211> 283 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> Probe 
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<400> 32 

tctgaaagca ctttaatgat tccccttccc ccaaactcca gggaatggag gggggacccc 60

gcagccaaaa cattcccccc attcccgacc cccctctcct cttctagccc atgcccttcc 120

ccggtggagg gagggagcag ggagccctca ctctccacgc cccttgcttg catctgtata 180

tagtgtgagc agcaagtaac ccttctcctc cctcccccct cccccctcct caatgtagtg 240

gccttggata tcctgtttgt taataaagac aattcaacca gct 283

<210> 33

<211> 2714

<212> DNA

<213> Homo sapiens



<400> 33 
ggcacagggc 

acaagctaag 

ccaggccaag 

gaaacatcag 

ttgcattatc

aaattacaga 

atgttcaaaa 

acattttgaa 

gcttttagag 

cttttttgat 

tggaattacc 

agagtttgct

cattatatta 

tctctttctc 

tcaggaaaca 

attagagttc 

agtggttaag 

ggtacctttt 

gattcctatg 

ggaagtaaat 

aggcattatg 

actaagcaaa 

aagttttaca 

cattctgatt 



gaggttttat 

cagcagcccc 

aagaggaaaa 

tatgaaatta 

attgaaacac 

tttaaaaatc 

gaagtctggc 

gttctgcatt

gtatgtgaag 

agatttatgt 

tcattattca 

tacgtcactg 

aaggctttaa 

caagttgatg 

ttcattcaaa 

cagtacagaa 

aaagcctcag 

gtcaatgtag 

gaagacagac 

tacataaaca 

acaccaccga 

caagttggaa 

gaaagtagtg 

ttaaaactta 



acacctgaaa 

agcccagcca 

ctacccagga 

ggaattgttg 

ctcacaaaga 

tttttattaa 

taaacatgtt 

ctgacttgga 

tatacacact 

tgacacaaaa 

ttgcttccaa 

atggtgcttg 

aatgggaact 

ctcttaaaga 

tagctcagct 

tactgactgc 

gtttggagtg 

taaaaagtac 

ataatatcca 

ccttcagaaa 

agagcactga 

ttcaccaaga 

ctgtgattga 

caattggcac 



gaagagaatg 

gacggaatcc

tgtcaaaaaa 

gccacctgta 

aataggaaca 

tccttcacct 

aaaaaaggag 

accacagatg 

tcatagggaa 

ggatataaat 

acttgaggaa 

cagtgaagag 

ttgtcctgta 

tgctcctaaa 

tttagatctg 

tgctgccttg 

ggacagtatt 

tagtccagtg 

gacacataca 

agggggacag 

aaaaccacca 

ttgggtagaa 

ttgccctagc 

taaagaatac 



tcaagacgaa 

ccccaagaag 

agaagagagg 

ttatctgggg 

agtgatttct 

ttgcctgatt

agcagatatg 

aggtccatac 

acattttatc 

aaaaatatgc 

atctatgctc

gatatcttaa 

acaatcatct 

gttcttctac 

tgtattctag 

tgccatttta

tcagaatgtg 

aagctgaaga 

aactatttgg 

ttgtcaccag 

ggaaaacact 

ctggtatcac 

caattcacaa 

atttaattat 



gtagccgttt 

cccagataat 

aggtcaccaa 

ggatcagtcc 

ccagatttac 

taagctgggg 

ttcatgacaa 

ttctagactg 

ttgcacaaga

ttcaactcat 

ctaaactcca 

ggatggaact 

cctggctaaa 

ctcagtattc 

ccattgattc 

cctccattga 

tagattggat 

cttttaagaa 

ctatgctgga 

tgtgcaatgg 

aaagaagata 

tgaactacta 

gttacactgc 

ttcctatgtt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
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agctgttaaa 

atagaagcca 

gtttatcttg 

taaagggtat 

acactcttga 

ttagttttgt 

ttgaatttct 

aacacatttt 

gtgggagtac

tttcttgaca 

acttaagtac 

ctaaggaatt 

ttaaggtcct 

taacacttga 

cctagaaatc 

ctaaatattt 

aactagattg 

ctatacattt 

gtttcagtat 

cttaaccttt 

tttaactgct 

cgctgccagc 



gaaacagcag 
accacagtct 
ataaactagg 
actaagtgat 
ctagtgcaat 
aataaggtga 
acaaatggtg 
ttaactaata 
caaagaaatt 
tgtaggttgc 
cctttcaaac 
ttttttttta 
ttctaaattc 
cctaaacttc 
tatttattaa 
taaattagct 
ctagtttatt 
attgttacgg 
tttgtctgaa 
taatactgtt 
taaaagtaaa 
tgta 



gacttgttta 

ataccatagc 

aattttgtca 

acagtacttt 

ttggttcttg 

ctaatttatc 

aaatttaatg 

aggcttagat 

ataaacaaga 

ttggtaataa 

tatttatatg 

atttagtgtg 

ctccattgtg 

tattttctta 

aaaaagacat 

tttctaaaaa 

ttgttatcag 

tatgaagtct 

aagaaaacac 

tatttttagc 

gttttgccat 
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caaagatgtc 
aatgtttttc 
ctggagtttt 
gaatctagtt 
aaaattaaat 
tatagctgct 
ttttttaaac 
gaacatggtg 
taaatgctgt 
cctttttgta 
aggaagtcac 
actaaggctt 
agataaggac 
aggaagaaga 
gaaaacttgc 
aaaaatccag 
atatgtgaat 
tctgtatagt 
cactaattgt 
ccattgttta 
tgcttggaga 



ttcattccca 
ctttaatcca 
ggactggata 
gttagattct 
ttaaacttgt 
atagcaagct 
tagtttattt 
ttcaacctgt 
ggctccttcc 
tatcacaatt 
tttactactc 
tatttatgtt 
agtgtcaaag 
gtattaaata 
tgtacatagg 
cctcataaag 
ctcttctccc
ttgtttttaa 
gtacatatgt 
aaaaataaaa 
aacttttttt 



aggttactgg 

gtgttactgt 

agtgctacct 

caaaattcct 

ttacaaaggt 

attataaaac 

gccttgccat 

gctctaaaca 

taactggggc 

tgggtgaaaa 

taagatatcc 

tgtgaaactg 

tgataaagct 

tatactgact 

ctagctattt 

tagattagaa 

tttgaagaaa 

actaatattt 

attatataaa 

gttaaaaaaa 

tccttctctg 



1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2714 



<210> 34 

<211> 6773 

<212> DNA 

<213> Homo sapiens 



<400> 34 
caagcatgtg 


atgttcttgt 


accttcttct 


gatagtacat 


ctcaacagtt gactccatat 


60 


agtcaagtcc 


atatttgttt gagatctggc aactatcagg aggtaataca gattttcatt 


120 


gaagacaact 


taaccttgag 


tttacctgtc 


cagttccgac 


agtcagtcct aagagaactc 


180 


tttaagaaag 


ctcaacaggg 


aaatgaagct 


ctagatgaaa 


tctgttttaa agtttgtgcc 


240 


tgtaatacag 


tccgtgatat 


actggaaggc 


agaacaatta 


gtgttcaatt taaccagcta 


300 


tttcttagac 


caaataaaga 


gaaaatagac 


tttcttcttg 


aggtatgttc aagatcagta 


360 


aatttagaaa 


aagcttcaga 


gtctttgaaa 


ggaaacatgg 


ctgcttttct aaagaatgtg 


420 
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tgtctggggt 


tggaagatct 


y Ccty La uy u u 


LLLaiyaLUL 


u u u uct uci Lyo 


crcttttcatt 

y w www w wti w w 


480 


acattgttga 


aagatgaaga 


eLCyaadyCta 


u u uy u uy a. u u 


a rra +■ rra (*rn a a 


y ay o wwww-w-w 


540 


agagtaaatc


tgtgcattaa 


acctgtaact 


tcattttatg 


d U d UuULayv* 


ttcaocaaot 


600 


gtcaacattg 


gtcagttaga 


gcatcaactt 


atattgtcag 


uyyauu-uu <-y 


cracrcrattaaa 


660 


caaattttaa 


ttgaattaca 


tggtatgact 


tcagagcgcc 


a fit* r*Y ctct a r* 
oy UL^Lyyau 


Qy w y w w w ci ci w 


720 


aagtgggaag 


taccttctgt 


ctatagtggt 


gttatcctgg 


ydduuddayd 


caatttaaca 

wcir w w wciu wa 


780 


agagatttgg 


tttatattct 


tatggccaaa ggtttgcact 


rri~»a «-r+* an+'rtt' 

y cay tac uy u 


uddyyduuuw 


840 


tcccatgcta 


aacagctctt 


tgctgcttgt 


ttggagttgg 


UddUdy ay uu 




900 


cttcgtcagg 


tcatgctgaa 


tgagatgttg


cttttggata 


W LUQ Uu^wvB 


cgaagctggg 


960 


acagggcagg 


caggagagag 


accgccatcc 


gaccttataa 


y l ay uy wa w-y 


aggctatctg


1020 


gaaatgaggc 


ttcctgatat


tcctcttcgt 


caagttatag 


/— f rrannaa t" o 
w. uyoyyuo i- y 


tattaccttt 


1080 


atgttaaact 


ggagagaaaa 


tgaatacctt 


acactccaag 




tttacttcaa 

b W \J *^ W V- W- y 


1140 


agtaatccat 


atgtaaagct 


tggacagctt 


ttagcagcta 


rfli"nraaana 


acttccaaac 


1200 


cctaaagaaa 


gcagacggac


tgccaaagac


ctttaagaag 


ttottattca 


aa t c t Qt aa ti 


1260 


gtgtccagtc 


agcacaaacg 


aggaaatgat 


ggcagagtta 




acaaaoraaaa 

tA *w y ^ y 


1320 


tctacgttag 


gtatcatgta


tcqcraqtgaa 


ctgctttctt


U UC* u^aaaaa 


attacgagaa 


1380 


ccactcgttt


tgactattat 


tttatcactc 


tttgtgaaac 


L LUaUClCl L.y V. 




1440 


attgtgaatg


atattacagc 


tgaacacatt 


tctatttggc 


uauu.uuuua u 


L.w.w»w-«aw-w-ww- 


1500 


cagtctgtgg 


actttgaagc 


tgtggcaatc 


acagtgaaag 


age uagu teg 


afatftf ar1*r 
acaLauauLU 


1560 


agtataaatc 


caaataacca 


ttcttggtta 


attatccagg 


caga ua u u to 


U uu Ly LaaLy 


1620 


aatcagtatt 


cagcagctct 


tcactattac 


ctccaggcag 


ytiy wty uy uy 


uuui_ydw>uuw> 


1680 


tttaacaagg 


4— +- y# r% y» 

CtytyCCCCC 


tgatgtttat 


acagaccagg 


L- ci ci i— d ok o cj y 


aatgataaaa 


1740 


tgx.ugi.ucuu 


^ +- /t a ^ 4— *H /-f 

tgctgaattg


ccacacacag 


gtggctattt 


f af fltr'ACTl" t 

l— Cl U \-» CI y \-r 


cctcagagaa 


1800 


d L.tyauuaUa 


dddudy uy u u 


taaatctctg 


caagaacaaa 


acagtcatga 


tgctatggac 


1860 


4— /— » /— . 4-* -n 4* r*^ r~l 

LCCLaCtdCg 


dCtaUd Uct uy 


yyduyuuduu 


a t" t" t* t* nrer a a 1~ 


acttoactta 


tcttcatcat 


1920 


aaaagaggag 


aaaCayalaa 


aagacaaatt 


gcaatcaaag 


ccatcaocca 

V- d u^*vj y v_* v_/ 


gacagagttg 


1980 


aatgcaagca 


atccagaaga 


agtgttacag 


ctggcagcgc 


ay ayaayyaa 


aaaaaafltft 

OOOQoay www 


2040 


ctccaagcaa 


tggcaaaact 


ttacttttaa 


gcagttaaat 


UUUUUUaaUU 


t* t- -t* a i - 1" t- f- 1" t 

U U Ud U U U U U t- 


2100 


aaacaatggg 


ctaaaaataa 


acagtattaa 


aaggttaagt 


UUdUdUodUci 


LaLui>y w a w- o 


2160 


caattagtgg 


tgttttcttt


tcagacaaaa 


tactgaaaca 


dduduudy u u 


taaaaaraaa 

W CI uu CLCl W*du U 


2220 


ctatacagaa 


gacttcatac


cgtaacaata 


aatgtatagt 


ftrfhi^anan 

LUUl.Luaaayj 


nrratraaoaoa 

yyy yy 


2280 


ttcacatatc 


tgataacaaa 


ataaactagc 


aatctagttt 


tctaatctac 


tttatgaggc 




tggatttttt 


ttttagaaaa 


gctaatttaa 


aatatttaga 


aatagctagc 


ctatgtacag 


2400 


caagttttca 


tgtctttttt 


taataaatag 


atttctagga 


gtcagtatat 


atttaatact 


2460 


cttcttcctt 


aagaaaatag 


aagtttaggt 


caagtgttaa 


gctttatcac 


tttgacactg 


2520 


tccttatctc 


acaatggagg 


aatttagaaa 


ggaccttaac 


agtttcacaa 


acataaataa 


2580 
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agccttagtc 


acactaaatt 


aaaaaaaaaa 


tgacttcctc 


atataaatag 


tttgaaaggg 


acaaaaaggt 


tattaccaag 


caacctacat 


cagcatttat 


cttgtttata 


atttctttgg 


accatgttca 


tctaagcctt 


attagttaaa 


tttaaaaaac 


attaaatttc 


accatttgta 


gcagctatag 


ataaattagt 


caccttatta 


tttaattttc 


aagaaccaaa 


ttgcactagt 


actagattca 


aagtactgta 


tcacttagta 


aaactccagt 


gacaaaattc 


ctagtttatc 


aaaaacattg 


ctatggtata 


gactgtggtt 


acatctttgt 


aaacaagtcc 


tgctgtttct 


tattctttag 


tgccaattgt 


aagttttaaa 


ctagggcaat 


caatcacagc 


actactttct 


tctacccaat 


cttggtgaat 


tccaacttgt 


ggtggttttt 


cagtgctctt 


cggtggtgtc 


tgtccccctt 


ttctgaaggt 


gtttatgtaa 


atgcactgat 


ttrcataaatc 


aaagtcaaac 


taggcctaat 


tttaatagtc 


tttttatgtc 


gtttctgtaa 


cgttatatat 


tttttaaact 


gacacttcac 


aaatcttgct 


ttgatttcaa 


aatcagtgtg 


ccccttagaa 


ctttctttcc 


ctaccatcag 


ttttggtgac 


ttactagatt 


aacataccag 


catagccaaa 


tagtttgtat 


gaatcttctt 


aaaagtcttc 


agcttcactg 


gtaccatcca 


atctacacat 


tctgaaatac 


aaagttagaa 


aagcatgaag 


gttgtacatc 


tgttgctgtt 


tttgagatat 


tttaaaagaa 


caaaacagat 


tatcactctc 


aaactattta 


agaacttctt 


aaccagtacc 


ttctacafcat 


cttaaagcag 


aactataatg 


ttgattcatt 


ttgttgatac 


acattcagga 


ttggaaagta 


taattttttt 


tttgccaaag 


ttttaaatgc 


tgtagtccca 


ttagaagttg 


tgaaaaggta 


ggctttctta 


accacttcaa 


tggaggtaaa 




attccttagg gatatcttag agtagtaaag 
tacttaagtt tttcacccaa attgtgatat 
gtcaagaaag ccccagttag gaaggagcca 
tactcccact gtttagagca caggttgaac 
aaatgtgtta tggcaaggca aataaactag 
gaaattcaag ttttataata gcttgctata 
caaaactaaa cctttgtaaa caagtttaaa 
caagagtgta ggaattttga gaatctaaca 
taccctttaa ggtagcactt atccagtcca 
aagataaaca cagtaacact ggattaaagg 
ggcttctatc cagtaacctt gggaatgaag 
ttaacagcta acataggaaa taattaaatg 
atcagaatgg cagtgtaact tgtgaattgg 
gtaaaacttt agtagttcag tgataccagt 
ttgcttagtt atcttcttta gtgttttcct 
ataatgcctc cattgcacac tggtgacaac 
tttacttcct cctatacatg ggaagaaatc 
cagacttctg ggtacttatt tgagattatt 
tttgcaagtg tgaagggtca tattctgaaa 
ctttatctag actgttggca ttgatcttga 
agtaatttta ttaacttttc tacattttga 
cctgaaactg cctgaaggag tactctattc 
cagatagcaa agccaaaaaa ctcacaaaaa 
gtgtctggat attatgtctg tcttccatag 
gactagtact ttttactaca ttgacaaaag 
tgtcccactc caaacctaga tagatagaaa 
agaaactatc ttacatatgt ctgatgtact 
accaaatcat aaccaagaag tttagcatgt 
catgactatg ttgaagggaa aaaggacttc 
gaaattgaaa tggtcaaatc ccaaagaact 
tcaactgtat ttaaattcca tttggtcttt 
cttctaacag aaagataatt actgaacagc 
atgtttagca gaatgttaaa gttcagagac 
agaagacaac aaatagagag tcttacctga 
atggcacaag gcagcagcag tcagtattct 



PCT/IL02/00904 

2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500

4560 

4620

4680 
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gtactggaac 
acaaaagtac 
gggatctaga
cagcactttg 
ctaacacggt 
acgcgcctgt 
tgcagagctt 
gactccgtct 
actgttgcaa 
gatagtgatg 
taggaggcag 
aactcctaag 
gtggtggcta 
attttatagg 
aaagatacct 
tgagattagt 
tggtatagtg 
aaacttcaat 
ggaattggaa 
ttatcaaaga 
taggaattct 
tcagtatact 
tgcccacaaa 
gattttcaca 
atacagcagc 
tttttttttt 
agtgcgatct 
gcctcccaag 
atctgcctgc 
cggccagatc 
tttactttga 
taacttgata 
cattataaag 
gagatacttt 
tttttctaga 



tctaatgaat 
aagcaattta 
gataaataag 
ggaggccgag 
gaaaccccgc 
agtcccagct 
gcagtgagca 
caaaaaaaaa 
tacagtgtga 
gcctcaaatt 
agaatgttct 
gtgaaaaata 
gaaagaggca 
ctgattcttg 
tagaatgcag 
taaaaggctg 
taaaaggggt 
tctctttgtc 
gtgaaatatt 
tttaccatca 
ggatgaaaaa 
tagtggttat 
ttttaaccta 
tcccacatag 
cttaagatta 
ttaatttttt 
cggctcactg 
taacatgttg 
ctgcctcagc 
ctcctcctcc 
atttatgttc 
ggcactctgt 
cttaaacaca 
agtttggact 
aatattaagc 



caatggctag 
ggagaaagat 
ccaccacccg 
gcgggtggat 
ctctactaaa 
actcgggagg 
gagatcgcgc 
aaaaacaagc 
catgtactag 
gccataaatg 
aggtataggg 
aataaggagt 
tggtctagat 
aagctactgg 
tgaagcagac 
tataatctag 
agatttcaca 
ctcattggtc 
actgttatat 
ctatcagaag 
attaagcttt 
ccaatttgag 
ggtgacttaa 
aataagaggg 
cttacgagaa 
gagatggagt 
caacctccac 
gctaggctgc 
cgcccaaagt 
tctacttact 
taaaaaattt 
gtatccaaat 
actggcgaaa 
cctaaagaat 
aacataaaca 
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aatacacaga 



gagtactaaa 
gccaggcgcg 
cacaaggtca 
aaatacaaaa 
ctgaggcagg 
cactgcactc 
caccaacctg 
caggaagggc 
ggtcttaaaa 
acattacttg 
accttcattt 
tggatcacaa 
aagattttta 
agactagaag 
gcaataagag 
gatttgagaa 
cagaaggtag 
acctctagaa 
ggtatagctg 
taataaaaag 
tattcataat 
taattatccc 
tagattttct 
gtaagcaaga 
cttgctctgt 
ctcccaggtt 
ctcagccgcc 
gctgagatta 
tactttgtta 
ttttaacaaa 
gtaaagacat 
aaaatgcttt 
gaaagtactc 
ctggggacag 



tctaaaagct aacaggaaaa 
tgtctcttgc taaaacctta 
gtggctcacg cctgtaatcc 
cgagatcgag accatcctgg 
aattagctgg gcttggtggc 
agaatggcgt gaacccggga 
cagcctgggc gaaagagcga 
aaggaagtag acaaggaagg 
acctaatcca gattggaaaa 
gataagggag ccaggaagag 
gaactcagtt cacagttcag 
cttatcaaga aagatgaggg 
agggtcttta agaagtcaga 
aatcaaagtt ccattttaag 
aaaacatgtt tattaagcag 
ctgaactagt agcagtggaa 
gatacttgtg cagtggaatt 
gagaaatggg agaagagctg 
agtccacatt gtttatcggc 
cctaggacaa tttgggatgc 
ttttataaaa taaaccaatt 
gtgctagatt taagcaccac 
caaatgtctt ccatatgtta 
tcacttttgt tatatggcag 
aagaatggga tctcctcttt 
tgctcaggct ggagtgcagt 
ccagcgattc tcctgcctca 
caaactcctg acctcaagtg 
gagacctgag ccacagtgcc 
aatatgctag cctggaaaag 
gtaattttaa ttctgatatt 
catacagaat aattctatgc 
tccccatttt atatcaaaaa 
agaaaagtgt aaggactttg 
aactttatgc gtc 



4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6773 

<210> 35 

<211> 1590 

<212> DNA 

<213> Mus musculus 



<400> 35 
ctgagaacca gacatcagga tggcaggggc tctgcttggt gccctgcttc tcctgacact 60
ctttggcaga agccagggaa agaatgaaga gcttagcctg tatcaccatc tcttcgacaa 120
ttatgatcca gaatgccggc cagttaggag acctgaggac actgtcacca tcaccctcaa 180
ggtcacccta accaacctca tctcactgaa cgagaaagaa gaaactctga ccaccagtgt 240
ctggattggc attgactggc acgactatcg gctcaactac agcaaggacg attttgcagg 300
tgtaggaatc ctccgggtcc cttcagaaca tgtatggctg ccagagattg ttctagaaaa 360
caatattgat gggcagtttg gagtggccta cgacagcaat gttctagtct atgagggagg 420
ctatgtgagc tggttgcccc cagccatcta ccgcagcacc tgcgcagtgg aggtcaccta 480
tttccccttt gactggcaga actgctctct catttttcgc tcccagacct acaatgctga 540
ggaggtggag ttcatctttg ccgtggatga cgacggcaat accatcaaca agattgacat 600
tgacacggca gcttttaccg agaatggaga atgggccata gactactgcc caggcatgat 660
tcgccgctat gagggaggtt ccacagaagg tcctggagaa actgacgtca tctatacgct 720
catcatccgc cggaagccgc ttttttacgt cattaacatc attgtgcctt gcgtgctcat 780
ttctggcttg gtgctgctcg cttacttcct gcctgcgcag gctggtggcc agaaatgcac 840
ggtctctatc aacgtcctgc tagcccagac tgtcttcttg tttctaattg cccagaaaat 900
tccagagact tctctgagcg tgccactgct gggcaggtat cttatattcg tcatggtggt 960
tgccacgctc attgtcatga attgcgtcat cgtgctcaac gtatctttga ggacgccaac 1020
gactcatgct acatcccctc ggctgcgcca gattttatta gagctgctgc cgcgtctcct 1080
gggctcgagc ccacccccag aggatccccg aactgcctca ccagcgaggc gtgcctcgtc 1140
tgtgggcatt ctgctcagag cggaggagct catcttgaaa aagccgcgga gcgaactcgt 1200
gtttgagggt cagaggcatc ggcacggaac ttggaccgca gccctctgcc agaacctggg 1260
tgctgcagcc ccagaaatcc gctgctgtgt ggatgctgtg aactttgtgg ctgagagcac 1320
aagagaccag gaagccactg gagaggaact gtccgactgg gtgcgtatgg ggaaggccct 1380
ggacaatgtc tgtttttggg cagctttggt gctcttcagc gttggttcta ctctcatctt 1440
ccttgggggt tacttcaacc aagttcctga tctcccttac ccaccgtgca tccaaccatg 1500
agcctgcact ggcacccacc tctcccccac cccccaagaa agagattttg aaaacaggcc 1560
gctgacaata aatctggttt gtgaacttgc 1590



<210> 36 
<211> 2227 
<212> DNA 






<213> Mus musculus 



<400> 36 

tgtgagcagc aagtagccct tctccctcct gtatcctttc tcaatgtagt ggccttggat 60 

atatcccctt tgttaataaa gacaattcaa ccagcttcca ccattttgag atcctactat 120 

tgttctctct caatcctgga gagatttgag agttgagaat gcagagggta gaggaaaggc 180 

attaggctct gtgaagttac tgtgataata gagacgaagt aaggtggatg aataggccag 240 

ggatcagtcc tgacacggta ggaccctttg agaatagttt ttaccagccc cagcagggcc 300 

aggccagact tctggcttca gtgtttctat atctgggtct tgtaaaaacc tcattggcta 360 

tcaactagat aaacattctt taggttagaa ggagccaaga gcaaaattga accaattgcc 420 

tccaagtgcc tgaccaaacc acccacccat cttctacttc cctgaggagt tggacccacc 480 

cacatgacca cacaacccct cgggcagttc acaaaccaga tttattgtca gcggcctgtt 540 

ttcaaaatct ctttcttggg gggtggggga gaggtgggtg ccagtgcagg ctcatggttg 600 

gatgcacggt gggtaaggga gatcaggaac ttggttgaag taacccccaa ggaagatgag 660 

agtagaacca acgctgaaga gcaccaaagc tgcccaaaaa cagacattgt ccagggcctt 720 

ccccatacgc acccagtcgg acagttcctg tgagagagag cttagcgagg gaggagcctg 780 

gagggcgggg catctagcac tgctccgcct caacctccca acccacctct ccagtggctt 840 

cctggtctct tgtgctctca gccacaaagt tcacagcatc cacacagcag cggatttctg 900 

gggctgcagc acccaggttc tggcagaggg ctgctgctaa ggcaacagca agcgctaggt 960 

cattaaaaga gcgtcctaac ggcgagtgta tgcctttgac ccaagagcag tgcttaccgg 1020 

tccaagttcc gtgccgatgc ctctgaccct caaacacgag ttcgctccgc ggctttttca 1080 

agatgagctc ctccgctctg agcagaatgc ccacagacga ggcacgcctc gctggtgagg 1140 

cagttcgggg atcctctggg ggtgggctcg agcccaggag acgcggcagc agctctaata 1200 

aaatctgcag ccggggcaga gagaggttcc aagcccgctt cccacccctg ggcagtactt 1260 

tctccaacca gcgcttacct ggcgcagccg aggggatgta gcatgagtcg ttggcgtcct 1320 

caaagatacg ttgagcacga tgacgcaatt catgacaatg agcgtggcaa ccaccatgac 1380 

gaatataaga tacctgatat acagaagcct gatgtcacag caccccacaa acaaggcact 1440 

agctgccctc tacctcacaa ataccacctc gcacagctgg tggcgttact tcttgatcct 1500 

cctcaacgat gccagtattg tcctggccct tctgcatata ccatctgttg cggacatgaa 1560 

ggggattccc agcaatttgg acaccctgct gtgggtctac cacttccaca gctccaccga 1620 

ggtgagggta ttagaatggc agaatctgga gaggtcccca gctcttcctg ctatggccct 1680 

ttccatgtga tcattccact cactaccctt gctcctccag gtggccttac agcctccact 1740 

tctatcttcc ctggaacttg ctgtggccgc agctcacgaa tatctggtgc aaaggttcag 1800 

agagcttaag tcccaggacc ccctggaatc cgacaagtcg cccacccaga aggccaccct 1860 

agggctggtg ctaagagaag ctgcagccag catcatgagc tttggagcca ccttgttaga 1920 

ggtgctgctc tgggaggctg agggatggga ataaaagggg gagagggcta ggccaacaaa 1980 

agcaaggacc tctagcccat atgccccaat gtagatctcg gccctgtggc tgcagcagga 2040 

ggtgcagcga ctggacggcg gcaacgactg cccaggccca gccccagaca ctggggatcc 2100 

tggtagggcg ctggcccgtg tagccctggc cgcagggcag gggattcggc aagctggaac 2160 

ggcagctggc gcaagtgccc ggtacctgat ccagggggcg tggttgtacc tgtgtggacg 2220 

aggtttg 2227 



<210> 37 

<211> 2472 

<212> DNA 

<213> Homo sapiens 



<400> 37 



agcatcgagt cggccttgtt gcctactgga gtctccgcag agcccgggcg ggagtagctg 60
gtggaccccg ttgagctgcc gaacttccgg gactcccccg cgaccccttc ccagcttccc 120
gtccgctccg ccgcagcgat tgtctcggtg ggttgattcg gcacaaaccg cccgacccag 180
gggccggtgc gcgtgtggaa ggggaagcac tcccctcgtg gtcgcctgga ggtgcgctgg 240
aggagggggt gacataacca gggactcgag gtccgccgtg ggaatgatcc acgaactgct 300
cttggctctg agcgggtacc ctgggtccat tttcacctgg aacaagcgga gtggcctgca 360
ggtatcgcag gacttccctt tcctccaccc cagtgagacc agtgtcctga atcgactctg 420
ccggctcggc acagactata ttcgcttcac tgagttcatt gaacagtaca cgggccatgt 480
gcaacagcag gatcaccatc catctcaaca gggccaaggt gggttacatg gaatctacct 540
gcgggccttc tgcacagggc tggattctgt tttgcagcct tatcgccaag cactgcttga 600
tttggaacaa gagttcctgg gtgatcccca tctctccata tcacatgtca actacttcct 660
agaccagttc cagcttcttt ttccctctgt gatggttgta gtagaacaaa ttaaaagtca 720
aaagattcat ggttgtcaaa tcctggaaac agtctacaaa cacagctgtg gggggttgcc 780
tcctgttcga agtgcactgg aaaaaatcct ggccgtttgt catggggtca tgtataaaca 840
gctctcagcc tggatgctcc atggactcct cttggaccag catgaagaat tctttatcaa 900
acaggggcca tcttctggta atgtcagtgc ccagccagaa gaggacgagg aggatctggg 960
cattggggga ctgacaggaa aacaactgag agaactgcag gacttgcgcc tgattgagga 1020
agagaacatg ctggcaccat ctctgaagca gttttcccta cgagtggaga ttttgccatc 1080
ctacattcca gtgagggttg ctgaaaaaat cctatttgtt ggagaatctg tccagatgtt 1140
tgagaatcaa aatgtgaacc tgactagaaa aggatccatt ttgaaaaacc aggaagacac 1200
ttttgctgca gagctgcacc gtctcaagca gcagccactc ttcagcttgg tggactttga 1260
acaggtggtg gatcgcattc gcagcactgt ggctgagcat ctctggaagt tgatggtaga 1320
agaatccgat ttactgggtc agctgaagat cattaaagac ttttaccttc tgggacgtgg 1380
agaactgttt caggccttca ttgacacagc tcaacacatg ttgaaaacac cacccactgc 1440
agtaactgag catgatgtga atgtggcctt tcaacagtca gcacacaagg tattgctaga 1500
tgatgacaac cttctccctc tgttgcactt gacaatcgag tatcacggaa aggagcacaa 1560
agcagatgct actcaggcaa gagaagggcc ttctcgggaa acttctcccc gggaagcccc 1620
tgcatctggc tgggcagccc taggtctttc ctacaaagta cagtggccac tacatattct 1680
cttcacccca gctgtcctgg aaaaaaatag acaattttaa aaaccaaaca gaatgggact 1740
gtcttctgca agcctaccta caaacaggta caatgttgtt tttaagtact tactgagtgt 1800
gcgccgggtg caagctgagc tgcagcactg ctgggcccta caaatgcagc gcaagcacct 1860
caagtcgaac cagactgatg caatcaagtg gcgcctaaga aatcacatgg catttttggt 1920
ggataatctt cagtactatc tccaggtaga tgtgttggag tctcagttct cccagctgct 1980
tcatcagatc aattctaccc gagactttga aagcatccga ttggctcatg accacttcct 2040
gagcaatttg ctggctcaat cctttatcct attgaaacct gtgtttcact gcctgaatga 2100
aatcctagat ctctgtcaca gtttttgttc gctggtcagt cagaacctag gcccactgga 2160
tgagcgtgga gccgcccagc tgagcattct cgtgaagggc tttagccgcc agtcttcact 2220
cctgttcaag attctctcca gtgttcggaa tcatcagatc aactcagatt tggctcaact 2280
actgttacga ctagattata acaaatacta tacccaggct ggtggaactc tgggcagttt 2340
cgggatgtga aaatttctgg ctcataaatt gaaataacag ccacgttccc aaggttgtaa 2400
cagaagattc aaaacatccc attctagcca cacacaaata aatatctgcg gcttaaaaaa 2460
aaaaaaaaaa aa 2472



<210> 38 

<211> 4165 

<212> DNA 

<213> Homo sapiens 

<400> 38 

agcatcgagt cggccttgtt gcctactgga gtctccgcag agcccgggcg ggagtagctg 60
gtggaccccg ttgagctgcc gaacttccgg gactcccccg cgaccccttc ccagcttccc 120
gtccgctccg ccgcagcgat tgtctcggtg ggttgattcg gcacaaaccg cccgacccag 180
gggccggtgc gcgtgtggaa ggggaagcac tcccctcgtg gtcgcctgga ggtgcgctgg 240
aggagggggt gacataacca gggactcgag gtccgccgtg ggaatgatcc acgaactgct 300
cttggctctg agcgggtacc ctgggtccat tttcacctgg aacaagcgga gtggcctgca 360
ggtatcgcag gacttccctt tcctccaccc cagtgagacc agtgtcctga atcgactctg 420
ccggctcggc acagactata ttcgcttcac tgagttcatt gaacagtaca cgggccatgt 480
gcaacagcag gatcaccatc catctcaaca gggccaaggt gggttacatg gaatctacct 540
gcgggccttc tgcacagggc tggattctgt tttgcagcct tatcgccaag cactgcttga 600
tttggaacaa gagttcctgg gtgatcccca tctctccata tcacatgtca actacttcct 660
agaccagttc cagcttcttt ttccctctgt gatggttgta gtagaacaaa ttaaaagtca 720
aaagattcat ggttgtcaaa tcctggaaac agtctacaaa cacagctgtg gggggttgcc 780
tcctgttcga agtgcactgg aaaaaatcct ggccgtttgt catggggtca tgtataaaca 840
gctctcagcc tggatgctcc atggactcct cttggaccag catgaagaat tctttatcaa 900
acaggggcca tcttctggta atgtcagtgc ccagccagaa gaggacgagg aggatctggg 960
cattggggga ctgacaggaa aacaactgag agaactgcag gacttgcgcc tgattgagga 1020
agagaacatg ctggcaccat ctctgaagca gttttcccta cgagtggaga ttttgccatc 1080
ctacattcca gtgagggttg ctgaaaaaat cctatttgtt ggagaatctg tccagatgtt 1140
tgagaatcaa aatgtgaacc tgactagaaa aggatccatt ttgaaaaacc aggaagacac 1200
ttttgctgca gagctgcacc gtctcaagca gcagccactc ttcagcttgg tggactttga 1260
acaggtggtg gatcgcattc gcagcactgt ggctgagcat ctctggaagt tgatggtaga 1320
agaatccgat ttactgggtc agctgaagat cattaaagac ttttaccttc tgggacgtgg 1380
agaactgttt caggccttca ttgacacagc tcaacacatg ttgaaaacac cacccactgc 1440
agtaactgag catgatgtga atgtggcctt tcaacagtca gcacacaagg tattgctaga 1500
tgatgacaac cttctccctc tgttgcactt gacaatcgag tatcacggaa aggagcacaa 1560
agcagatgct actcaggcaa gagaagggcc ttctcgggaa acttctcccc gggaagcccc 1620
tgcatctggc tgggcagccc taggtctttc ctacaaagta cagtggccac tacatattct 1680
cttcacccca gctgtcctgg aaaagtacaa tgttgttttt aagtacttac tgagtgtgcg 1740
ccgggtgcaa gctgagctgc agcactgctg ggccctacaa atgcagcgca agcacctcaa 1800
gtcgaaccag actgatgcaa tcaagtggcg cctaagaaat cacatggcat ttttggtgga 1860
taatcttcag tactatctcc aggtagatgt gttggagtct cagttctccc agctgcttca 1920
tcagatcaat tctacccgag actttgaaag catccgattg gctcatgacc acttcctgag 1980
caatttgctg gctcaatcct ttatcctatt gaaacctgtg tttcactgcc tgaatgaaat 2040
cctagatctc tgtcacagtt tttgtttgct ggtcagtcag aacctaggcc cactggatga 2100
gcgtggagcc gcccagctga gcattctcgt gaagggcttt agccgccagt cttcactcct 2160
gttcaagatt ctctccagtg ttcggaatca tcagatcaac tcagatttgg ctcaactact 2220
gttacgacta gattataaca aatactatac ccaggctggt ggaactctgg gcagtttcgg 2280
gatgtgaaaa tttctggctc ataaattgaa ataacagcca cgttcccaag gttgtaacag 2340
aagattcaaa acatcccatt ctagccacac acaaataaat atctgcggct tagtgatagg 2400
actctacctt ttctcctaga agcagttact gaacatccag gagtacaact ccttcccatc 2460
attcccatgt ggaagggtct ctcccatcaa ggagaacatg tggcatctct gatcctttac 2520
attgagaaca tttgttggat atgttcattt attcaatagt catttattga gcacctacta 2580
cgtaccttgg tactgttcaa gctgtgggag atacagcggt agacaaacaa tatagagcag 2640
aaagttaaat attttatggt tcatatgtga aaaagtaatt atgtttataa atagactaac 2700
tgctggatgt taccaccaag taagaaagca acaggtaaga taggctttct ctctccctat 2760
accaagtaat ttatacctac acagattggg caattctagc taatgaaaat atacttaaaa 2820
gtatttctta ggccgggcat ggtggctcac acctgtaatc ccagcacttt gggaggccga 2880
ggcgggcgga tcacctgaag tcaggagttt gagaccagcc tgaccaacat gatgaaacct 2940
cgattctact aaaaatacaa aaattagcca ggtgtggtgg catgtgcctg taatcccagc 3000
tactcaggag gctgagacag gagaattgct tgaacctggg aagcagacgc tgcagtgagc 3060
tgagattgtg ccattgcatt ccagcctggg caacaagagc gaaattccgt ctcaaaaaaa 3120
aaaaaaaaaa aaaaagtatt attctccaag aaaaaggtcc ttaagaaaaa attgagatca 3180
agttgttaga tttttaaata ctgaagattg caggcccaat tacccatctt acacaaacca 3240
taggggttga agttatctta atatggccca gccatcactg gtaatcaata ttcatatcag 3300
tgtaagtaaa aagaaatatt cactgaacaa cgccctccaa actgaaaaag aatgcagtgt 3360
tctggcatca ggttatagtc actgcatctg gttttcatca ctacatattc tacacacact 3420
gggaagctct gacaacttat tccctgctat tatcaactaa agatcaccct ttccactgct 3480
gtctctggag caggagctgg caaactatgg cctgctgtct gtttttgtac agttttactg 3540
aaacacagcc gtgcccattt gtttactcat tgtctatggt tgctttcatg ccctcacagc 3600
aaaggcgagt agttgtgatg gatcaaatgg cccacaaagc ctgaaatatt tactctttga 3660
ccctttacag aaaaaaacct tgttgacccc tgctttagag aatgagaagc catgcaggga 3720
tcagtgatgc cagaggaagg gaaggaactg cttccagcta ttgtgacaat aataataata 3780
ataatattgg gtctttgact agaacgtgta acatttccag gtgttctcac ttgtgcttcc 3840
catgtttatc ttacggaagg tcattccatc aagcttatgg tcactgtccc ttcatggcag 3900
ttggtccttt cgttctccct ttagctctaa gagttgggga gtacccacag gtgagctgtg 3960
atctcagctc agagagagag catgaggtct tttttaactg tcaggaaaca gagctgtgcc 4020
caattccact caacttttgg cacaactgtt aatctgggcc ttcacctacc ttaaactgag 4080
tttctgcaag catagcattt tagacaccct ggaataacct tttgggaatg atgccacaga 4140
ataaagttca ctcttaactt ttcaa 4165

<210> 39 

<211> 27 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic oligonucleotide 

<400> 39 

ggagagaacc acccagccca gaagttc 

<210> 40 
<211> 23 
<212> DNA 
<213> Artificial sequence 



<220> 

<223> Synthetic oligonucleotide 
<400> 40 

aggaatggag gcggcccttc tgc 23 

<210> 41 

<211> 23 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic oligonucleotide 

<400> 41 

cggaggagct catcttgaaa aag 23 

<210> 42 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic oligonucleotide 

<400> 42 

gatcaggaac ttggttgaag taac 24 

<210> 43 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic oligonucleotide 

<400> 43 

tgtgagcagc aagtaaccct tctcc 25 

<210> 44 

<211> 793 

<212> DNA 
<220> 

<223> Probe 
<400> 44 



acagagttga 


atgcaagcaa 


tccagaagaa 